Educational Researchers Demand Cancellation of Spring 2021 Tests: Secretary Cardona Won’t Cancel, but Says In Future He May Reexamine Role of Testing

On Tuesday, in remarks at the annual legislative conference of the Council of Chief State School Officers, the new Secretary of Education, Miguel Cardona, declared that he will not bow to pressure and will instead continue to demand that standardized tests be administered again this year as usual, even though COVID-19 has utterly upended another school year.  Last year Betsy DeVos cancelled the tests as schools shut down in March.

The Washington Post‘s Valerie Strauss reported on Wednesday: “A day after more than 500 education researchers asked Education Secretary Miguel Cardona not to force school districts to administer federally mandated student standardized tests this year during the coronavirus pandemic, Cardona said Tuesday that policymakers needed the data obtained from the exams…. (H)e said student data obtained from the tests was important to help education officials create policy and target resources where they are most needed… Cardona said Tuesday that he would be willing to ‘reexamine what role assessments’ play in education—but not immediately. ‘This is not the year for a referendum on assessments, but I am open to conversations on how to make those better,’ he said.”

On Monday, 548 researchers from the nation’s colleges of education sent a joint letter urgently asking Secretary Cardona to cancel the federally required standardized achievement tests in grades 3-8 and once in high school. America’s standardized testing regime was mandated in January of 2002 in the No Child Left Behind Act and, in 2015, folded into that law’s successor, the Every Student Succeeds Act.  The federal government set up the testing regime as the foundation of a massive school accountability scheme that ranked and rated America’s public schools and set out to turn around (mostly through a cascade of sanctions) the poorest performing schools as measured by the tests. The law promised that all of America’s youth would score “proficient” by 2014. Today we know that the law did not improve academic achievement overall and that it failed to close academic achievement gaps by race and family economics. In fact, damage to students, their schools, and their teachers followed instead.

The letter, sent to Cardona on Monday from a large body of academic researchers in education, argues that forcing public schools to administer standardized tests this spring is not only impractical and burdensome for school districts when some students are learning in class and others online, but also unlikely to produce complete or reliable data. The letter was sent on behalf of the National Education Policy Center at the University of Colorado, Boulder, and the Beyond Test Scores Project, and authored by Jack Schneider at the University of Massachusetts, Lowell, and Lorrie Shepard, Michelle Renee Valladares, and Kevin Welner at the University of Colorado, Boulder. A list several pages long contains the names of 544 additional academic researchers.

Here are the concerns the researchers identify about the gathering of data through standardized tests this spring: “First, we strongly urge USED to work with states to approve requests for flexibility as they attempt to limit statewide testing, especially in states where significant numbers of students are still engaged in remote learning and where the state request has identified alternative data sources that can meet state needs.  This recommendation is based on the following: The results of remotely administered tests will not be equivalent to the results of in-person testing. Great variability in participation rates and non-random selection bias make it impossible to compare results across schools or between this year and previous years… (T)here is no way to prevent misinterpretation and misuse of these highly flawed data.”

The researchers also caution about the use of data, once gathered, from any administration of standardized tests this year: “We applaud USED’s recent decision to emphasize the importance of data for informational purposes, rather than high-stakes accountability. In light of research evidence, we wish to underscore the importance of continuing this practice in the future. For decades, experts have warned that the high-stakes use of any metric will distort results. Analyzing the impact of NCLB/ESSA, scholars have documented consequences like curriculum narrowing, teaching-to-the-test, the ‘triaging’ of resources, and cheating… The damage inflicted by racialized poverty on children, communities, and schools is devastating and daunting… Whatever their flaws, test-based accountability systems are intended to spotlight those inequalities and demand that they be addressed.  But standardized tests also have a long history of causing harm and denying opportunity to low-income students and students of color, and without immediate action they threaten to cause more harm now than ever.”

On Wednesday afternoon, Education Week’s Andrew Ujifusa interviewed Secretary Cardona. Ujifusa asked about Cardona’s decision to continue standardized tests this year, while being willing to work with states and offer some degree of flexibility. In his answer, Cardona expresses some of the same concerns the researchers raise in Monday’s letter about the past two decades’ uses of standardized testing: “To be overly enamored by data is to be vulnerable to their misuse.  So we have to keep in perspective what the data will tell us and what it won’t tell us. It should never be even considered at this point for (labeling) schools as high-achieving schools, or low-achieving schools. We need to forget about that. We also shouldn’t be utilizing data for (educator) evaluations, because it’s not valid for that this year. However, as we’re rolling out $130 billion (in federal COVID-19 aid for schools), any data that can help state leaders think about policy and distribution of funds, to make sure that it’s aimed at closing achievement gaps and (addressing) lack of access to quality learning, that’s critically important. The team has been working at the agency, even before I joined, on flexibilities. We know that one size doesn’t fit all. We know in some places, they’ve been in schools since day one. In other places they’re just starting to get in. So flexibility is critically important.” (Parenthetical statements are Ujifusa’s.)

While Secretary Cardona seems to share some of the researchers’ concerns, we will need to observe his actions carefully in upcoming months as he takes over a federal department that has been mired for twenty years in a scheme organized to stigmatize and punish the schools and teachers serving poor children. These tests have never been used to drive the allocation of resources on a scale that would help the students in the school districts where our society’s poorest children are segregated. Will Cardona change a department which has tried to shape up low scoring schools by inducing states to punish and sometimes fire the principal and the teachers, or by imposing school closures or state takeovers, or by encouraging states to locate privatized charter schools or offer private school vouchers to students in those districts?

In his recent book, Schoolhouse Burning, Derek Black highlights the massive school funding inequity that has endured throughout the past twenty years of standardized, test-based school accountability: “(W)hen it comes to districts serving primarily middle income students, most states provide those districts with the resources they need to achieve average outcomes… The average state provides districts serving predominantly poor students $6,239 less per pupil than they need.” (Schoolhouse Burning, p. 241)

Nobody Should Be Wasting Time Worrying About When to Administer Standardized Tests

Parents, children, teachers, principals, and school superintendents are living through a time of unknowns. COVID-19 is raging across the states with many public schools operating only online. Some public schools, which have been able to open in person or on hybrid schedules, have subsequently been forced to close already reopened buildings or specific classrooms as COVID-19 cases arise and everybody quarantines.

In the midst of a chaotic situation with no good and stable solutions for many public schools, suddenly last week everybody started worrying about what to do about this year’s standardized tests. The Washington Post’s Perry Stein reports that outgoing Secretary of Education Betsy DeVos postponed the winter administration of the National Assessment of Educational Progress, the one test administered across all the states, the test that tracks school achievement over the decades and is not distorted by high stakes consequences.

Representative Bobby Scott (D-VA) and Senator Patty Murray (D-WA), the Democratic leaders of the House and Senate education committees, agreed to delay the NAEP, but said the nation needs some kind of measure of learning loss during the pandemic.  They released a statement declaring that annual state tests mandated under the Every Student Succeeds Act must surely be administered: “Existing achievement gaps are widening for our most vulnerable students, including students from families with low incomes, students with disabilities, English learners, and students of color. In order for our nation to recover and rebuild from the pandemic, we must first understand the magnitude of learning loss that has impacted students across the country. That cannot happen without assessment data.”

While I frequently agree with Representative Scott and Senator Murray, I think worrying about standardized testing right now ought to be a low priority, and I think the state-by-state achievement tests mandated by the Every Student Succeeds Act are the wrong kind of test.  Nor do I believe that the mandated, annual state achievement tests are necessary to help teachers grasp their students’ learning needs during and following the widespread school closures and disruptions in the current school year.  Our schoolteachers are well-trained professionals who are prepared to develop their students’ reading comprehension skills, to track problems with computational skills and mathematical conceptualization, and to help support their students emotionally after a period of disruption. The emphasis right now and when children return to classrooms must be supporting teachers facing the complex challenge of serving children who have been out of the classroom for too long. Standardized test scores very often don’t even arrive at schools for months after the tests are administered; they play little role in supporting teachers’ capacity to discern their students’ learning gains or losses.

If we are looking for complex data about the impact of the pandemic on public schools across communities and across states, at some point it will be realistic for the National Center for Education Statistics again to administer the National Assessment of Educational Progress, which is designed as a national audit test to determine learning trends over time.  When it is practical to administer NAEP, certainly that test should happen.

The annual standardized tests, mandated first by No Child Left Behind and, since 2015, by the Every Student Succeeds Act, are designed for an entirely different purpose.  And ironically the purpose and use of these tests for holding schools accountable distorts the results as schools struggle to raise scores at any cost in order to avoid the high stakes punishments that Congress attached to these tests or forced the states to attach. What are these high stakes? States still have to submit to the U.S. Department of Education plans for how to turn around their lowest performing schools according to these tests.  Some states still evaluate teachers according to their students’ scores. States rate and rank particular schools and school districts according to their aggregate test scores. Many states publish these rankings, which encourages real estate redlining as well as racial and economic segregation across metropolitan areas. Other states place voucher programs or charter schools in school districts where scores are low. Some states take over low scoring schools and school districts and turn them over to appointed commissions that supplant locally elected school boards.  Some school districts have claimed to use school closure as a so-called turnaround plan.

In a profound 2017 book, The Testing Charade: Pretending to Make Schools Better, Daniel Koretz, a Harvard University expert on standardized testing, documents research exposing flaws in the entire strategy of No Child Left Behind, which combined standardized testing with high stakes punishments for schools unable quickly to raise students’ test scores. Koretz explains social scientist Don Campbell’s well-known theory describing the universal human response when high stakes are tied to a quantitative social indicator.  In this case, the social indicator is whether or not educators and particular schools can produce higher aggregate student test scores year after year:

“The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor… Achievement tests may well be valuable indicators of… achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (The Testing Charade, pp. 38-39)

Koretz shows that imposing high stakes punishments on schools and educators unable quickly to raise students’ scores inevitably produces reallocation of instruction to what is being tested, causes states eventually to lower standards, and causes some schools quietly to exclude from testing the students likely to fail. Under No Child Left Behind, the high stakes even led to outright cheating, as happened in Atlanta under Superintendent Beverly Hall.

What all this means is that the state achievement tests mandated by No Child Left Behind and the Every Student Succeeds Act—whether administered to students this year or put off until after vaccines are widely available and students return to their classrooms—are not an appropriate tool for measuring the long term impact of the pandemic on students’ lives and learning.

Ideological advocacy for holding public schools accountable drove the passage and implementation of the original No Child Left Behind Act. The idea was that educators can be motivated to work harder through fear if their schools are threatened with punishments.  The idea of attaching high stakes consequences for low test scores remains with us today. Last week Chester E. Finn, Jr., formerly of the Thomas B. Fordham Institute and now affiliated with the Hoover Institution, published a widely read column in the Washington Post.  Twenty years ago, Finn strongly promoted No Child Left Behind’s test-and-punish strategy, and clearly he continues to believe in using high stakes testing as a threat. Here is a paragraph from his recent column that Finn could easily have cut, pasted, and slightly updated from something he wrote back in 2001:

“The results from those state assessments are the main source of information about school performance and about pupil learning in the core subjects of the K-12 curriculum. The results also indicate whether America’s appalling — and persistent — achievement gaps are getting any narrower. These student statewide test results are the foundation of a school-performance measurement structure that the United States has been painstakingly constructing in the decades since being declared “A Nation at Risk” in 1983. The information from the tests is used at every level of the system. It enables parents to see how their children are faring on an “external” metric, beyond the grades conferred by their teachers, and it helps principals assess how their schools are doing. The results also equip superintendents to gauge what must be done to boost district-wide achievement, and they furnish state officials with the information needed to guide their assistance and interventions.”

Today, nearly two decades after the states were mandated to administer annual standardized tests and after No Child Left Behind imposed sanctions on the schools with the lowest scores, we know that the whole scheme failed to support children’s school achievement and failed to close achievement gaps. Some schools were charterized as a punishment; other schools were shut down; principals and teachers were fired.  And scores on the national audit test, the National Assessment of Educational Progress (the NAEP), have fallen in some cases and in other cases remained flat.

I believe it is unnecessary—in the midst of a raging pandemic and a Presidential transition—to worry about when the federal government will mandate widespread standardized testing.  The bigger question is whether and how the federal government will manage a plan to get the pandemic under control and provide enough support to help states and school districts get all children and adolescents back in school.

I agree with Diane Ravitch, who explains: “Resumption of standardized testing is completely ridiculous in the midst of a pandemic. The validity of the tests has always been an issue; their validity in the midst of a national crisis will be zero. They will show, even more starkly, that students who are in economically secure families have higher test scores than those who do not. They will show that children in poverty and children with disabilities have suffered disproportionately due to lack of schooling.  We already know that.  Why put pressure on students and teachers to demonstrate what we already know?  At this point, we don’t even know whether all students will have the advantage of in-person instruction by March.  If anything, we need a thorough review of the value, validity, and reliability of annual standardized testing, a practice that is unknown in any high-performing nation in the world.  We are choking on the rotten fumes of No Child Left Behind, Race to the Top, and the Every Student Succeeds Act.”

If High Stakes Standardized Testing Fades, Lots of Awful Punishments for Students, Teachers, and Schools Would Disappear

In yesterday’s Washington Post, Valerie Strauss published a very hopeful column: It Looks Like the Beginning of the End of America’s Obsession with Student Standardized Tests.  I hope she is right.  Her column covers current efforts to stop the requirement for college entrance exams and the wave of testing in primary and secondary public schools that was enshrined in the 2002 No Child Left Behind Act. This post will be limited to examining the implications of the mandated standardized testing that, for two decades, has dominated America’s K-12 public schools.

Strauss begins: “America has been obsessed with student standardized tests for nearly 20 years.  Now it looks like the country is at the beginning of the end of our high-stakes testing mania—both for K-12 ‘accountability’ purposes and in college admissions.  When President George W. Bush signed the K-12 No Child Left Behind Act in 2002, the country began an experiment based on the belief that we could test our way to educational success and end the achievement gap.  His successor, Barack Obama, ratcheted up the stakes of test scores under that same philosophy. It didn’t work, which came as no surprise to teachers and other critics. They had long pointed to extensive research showing standardized test scores are most strongly correlated to a student’s life circumstances.”

Strauss explains what’s different this year: “Now, we are seeing the collapse of the two-decade-old bipartisan consensus among major policymakers that testing was the key lever for holding students, schools and teachers ‘accountable.’ And it is no coincidence that it is happening against the backdrop of the coronavirus pandemic that has forced educational institutions to revamp how they operate.  States are learning that they can live without them, having been given permission by the Department of Education to not give them this past spring… Former vice president Joe Biden, who is the presumptive Democratic presidential nominee and ahead of Trump in many polls, has tried to distance himself from the pro-testing policies of the Obama administration. He was not a cheerleader of testing during Obama’s two terms and has said recently he is opposed to high-stakes testing.  That’s not a promise that he will work to reduce it, but it is a promising suggestion.”

Strauss publishes six principles from FairTest, the National Center for Fair and Open Testing, principles designed to guide state policy by reducing reliance on high-stakes testing:

  1. “Limit state standardized test requirements to no more than the minimum required by ESSA (the Every Student Succeeds Act that replaced No Child Left Behind) once each in reading and math in grades 3-8, plus once in high school, as well as one science test each in elementary, middle, and high school…
  2. “Seek federal waiver of testing requirements, at least for the 2020-2021 school year but preferably longer…
  3. “Terminate high-stakes consequences that rely on test scores for students (grade promotion tests, exit exams, course/program placement), teachers (bonuses, job ratings) and schools/districts (simplistic grading systems).
  4. “Protect young children by banning mass standardized testing before grade 3…
  5. “Enforce testing transparency and enhance public oversight…
  6. “Develop and implement performance-based assessment systems that enhance academic quality and equity by focusing on improvements in student work done over time.”

One of the most misunderstood issues about our current wave of testing is the impact of attaching high-stakes punishments to test scores. Test-and-punish was the central strategy of the No Child Left Behind Act.  It was assumed that, under the threat of sanctions, teachers would raise their expectations for their students and quickly raise test scores in even the public schools with low aggregate scores. You will remember that when the law passed in 2002, Congress gave America’s public schools a dozen years: by 2014, all American children were supposed to achieve proficiency.  Except it didn’t work.  We now know that the assumptions underlying No Child Left Behind failed to recognize many factors inside and outside of schools that affect standardized test scores.

In a profound book, The Testing Charade: Pretending to Make Schools Better, Daniel Koretz, a Harvard University expert on the design and uses of standardized testing, explores serious problems that arise when high stakes are attached to testing.  First there is social science research evidence that attaching high stakes punishments for teachers and public schools when scores don’t rise in fact distorts the test results and at the same time undermines in several ways the entire educational experience for both students and teachers: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor… Achievement tests may well be valuable indicators of… achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the education process in undesirable ways.” (The Testing Charade, pp. 38-39)  In chapter after chapter, Koretz demonstrates that the consequences have been particularly devastating in schools where child poverty is concentrated; testing has narrowed the curriculum to the tested subjects, forced teachers to coach students and teach to the test, and even resulted in cheating by educators to make a school’s or school district’s scores look better.

Second, Koretz demonstrates that, because children in some schools start farther behind and face far greater obstacles, No Child Left Behind’s uniform timeline for the testing and the law’s application of high-stakes punishments embodies a bias against public schools in the poorest communities and their teachers: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

I believe FairTest’s third principle is designed to undo the greatest damage wrought by two decades of high stakes testing: “Terminate high-stakes consequences that rely on test scores for students (grade promotion tests, exit exams, course/program placement), teachers (bonuses, job ratings) and schools/districts (simplistic grading systems).”  As states have undertaken to follow the dictates of No Child Left Behind, they have attached punishments for the schools and school districts where scores have failed to rise or where they have risen too slowly.  States have branded those schools and school districts as failures, and continued in several significant ways to punish the nation’s most vulnerable schools instead of providing support.  Across the United States, public schools in the poorest communities continue to receive less funding than the schools in America’s wealthiest and most exclusive suburbs.

Here are the high stakes punishments—always based primarily on aggregate students’ scores on standardized tests—that states persist in imposing on the schools and school districts where scores are low:

The Third Grade Guarantee:  Students who do not meet the standardized test cut score for “proficient” in reading are in many states held back for another year in third grade.  This happens despite research showing that students are developmentally ready to begin reading at very different ages and that forcing children to read in Kindergarten (as the Third Grade Guarantee has encouraged many schools to do) may cause students to struggle and to dislike reading.  Holding children back has also been shown eventually to increase the chance that a student will drop out before graduating from high school.

High School Exit Exams and Graduation Tests:  By denying high school diplomas to students who don’t pass a graduation exit exam, many states continue to punish high school students even if these students have passed all the required classes.

Teacher Evaluations:  Some states continue, as they promised Arne Duncan, to evaluate teachers by their students’ aggregate standardized test scores.  When the Every Student Succeeds Act replaced No Child Left Behind, the federal requirement that states had agreed to in order to qualify for Race to the Top grants and No Child Left Behind waivers was dropped. Yet tying teachers’ evaluations to their students’ standardized test scores remains in some states’ policies as a remnant of another era.

State Report Cards:  FairTest mentions “simplistic grading systems” for school districts. I believe these grading systems may be the most damaging negative consequence of high stakes testing because all sorts of other serious punishments cascade from the state report card grades.  States were required by No Child Left Behind to rate school districts and individual schools primarily by aggregate standardized test scores. Many states created school district report cards that award school districts and particular schools letter grades: “A” through “F.”  One of the most damaging consequences is that real estate sales websites like Zillow and Great Schools have adopted these state-awarded grades to brand specific communities as desirable places to live and to brand others as undesirable. Because aggregate standardized test scores correlate most highly with family income, the state report card grades, based largely on each school district’s aggregate student test scores, have created educational redlining that is driving racial and economic segregation across America’s metropolitan areas.

School Closures:  One of the original “turnaround” models under No Child Left Behind was school closure.  Some school districts have found ways to shutter or phase out low scoring schools.  In June of 2013, Chicago closed 50 schools, with over 80 percent in African American neighborhoods.  Research from the University of Chicago’s Consortium on School Research showed that students didn’t do better on the whole in receiving schools.  A University of Chicago sociologist, Eve Ewing, published a profound book, Ghosts in the Schoolyard: Racism and School Closings on Chicago’s South Side, about widespread grieving across one African American neighborhood when public schools which had served for years as community anchors were shut down.

Targeting Particular School Districts for Privatization:  Some states use aggregate standardized test scores to identify so-called “failing school districts” and then to enable children in those districts to qualify for private school tuition vouchers. Some states locate charter schools primarily in the school districts where aggregate standardized test scores are lower. Instead of investing more financial support for smaller classes and more staff in the public schools in those school districts, some states take the voucher dollars or the per-pupil state funding for each charter school student right out of the local school district budget.

State School District Takeovers:  State takeovers are the ultimate damaging consequence of the punishments imposed by state legislatures on their poorest and lowest scoring school districts.  Over the years many states have seized low scoring schools or school districts, imposed autocratic, state appointed CEOs to manage the schools or turned over the schools to a “state achievement authority.” Gradually, after the long failure of such state seizures of schools and whole school districts, the schools are being returned to locally elected school boards, but the damage to local schools and the disruption of communities is a long, sad story.

If you are searching for good books to explore while you are at home due to the pandemic, check out Daniel Koretz’s The Testing Charade and Eve Ewing’s Ghosts in the Schoolyard.

No Child Left Behind Failed, But Kevin Carey’s New Article Doesn’t Go Deep Enough to Explain Why

On Wednesday, Kevin Carey published an important piece in the Washington Post—a profile really of Amy Wilkins, currently the chief lobbyist for the National Alliance for Public Charter Schools, and formerly a lobbyist for many years at The Education Trust.  Carey, the Vice President for Education Policy at the New America Foundation, also worked for three years as a policy analyst at The Education Trust, from 2002-2005, in the years right after the passage of the No Child Left Behind Act.

In this week’s article, Carey accurately identifies The Education Trust, founded and directed for many years by Kati Haycock, as “a pro-school-reform organization.” He explains that The Education Trust’s mission grew out of the promises of the Civil Rights Movement—grounded not only in commitment to school integration, but endorsing the mission of the No Child Left Behind Act that test-based school accountability would ensure that schools better served black children, who had for generations been left behind.  The organization was a cheerleader for ending what was often described as the soft bigotry of low expectations: “National tests showed that white students were, on average, far surpassing their black and Latino peers, and that low-income students were falling behind. The Trust called this the ‘achievement gap.’… After the long, inconclusive battles for desegregated and well-funded schools, the federal government would finally ensure that the most disadvantaged students got the good schools they needed.”  The Education Trust also supported expanding school choice through the proliferation of charter schools.

It is significant that in his recent article Carey acknowledges the collapse of the two-decades-long national school accountability narrative. While Amy Wilkins hasn’t compromised her belief in test-based accountability and the creation of escapes for some children into charter schools, even Wilkins concedes a shift away from the vision she continues to endorse: “Amy Wilkins hasn’t given up on school reform.  She remains ‘struck by how politics allows the stubborn self-interest of adults to undermine again and again what’s right for poor kids and kids of color.’ But she says, ‘I have to believe we’re just at the wrong end of the pendulum swing.'”

In addition to profiling Wilkins, Carey also examines the ground shifting underneath public education policy. It is here where I believe his assessment falls short because he neglects to examine a mass of research demonstrating that disruptive, test-and-punish driven school reform has failed our nation’s poorest children.  And privatization through the expansion of charter schools has aggressively robbed the public schools that serve the mass of our children of essential dollars to keep class size small and to retain enough social workers, counselors, certified librarians and school nurses.

As evidence of a shift in the national narrative about education policy, Carey points to Elizabeth Warren’s education platform during her recent campaign for President—a proposal to end the federal Charter Schools Program and quadruple federal Title I funding for public schools serving concentrations of poor children: “Warren wasn’t the only politician who had turned hard against school reform. As the Democratic presidential candidates rolled out their platforms in 2019, they promoted unprecedentedly generous plans for education. Sen. Bernie Sanders called for tripling Title I funding and providing free prekindergarten for all. Former vice president Joe Biden also called for tripling Title I and free pre-K.  Meanwhile, school-reform ideas that had been staples of presidential agendas since the 1980s were nowhere to be found—unless they were being stridently denounced.”

So, what happened?  Carey traces pressure from schoolteachers who have consistently pushed back against the narrowing of the curriculum and the increased drilling that inevitably followed intense pressure to raise scores. Carey also reports on the failure of charter schools consistently to raise scores, the extremely disparate quality of charter schools, and the lack of transparency in these schools, which are publicly funded but privately operated. He quotes Wilkins’ assessment of her movement’s failures: “She… looks back on the school-reform tidal wave she helped unleash in 2001.  One crucial mistake, she says, was making all of NCLB’s consequences fall on individual teachers and schools, not the school districts and state education departments. And she says, ‘we should have been more aggressive about school funding equity. Far, far far more aggressive.’”

Carey’s own critique is deeper.  He explores the paltry fiscal investment Congress made in No Child Left Behind when it ramped up the emphasis on testing and punishing the schools unable quickly to raise scores.  And he reports on evidence that No Child Left Behind and the expansion of charter schools have neither significantly improved achievement overall nor closed achievement gaps: “Did school reform work?  High school graduation rates have improved over the past two decades, probably in response to accountability… NCLB produced modest bumps in student achievement on federal and state tests in the early years.  Those gains, however, were concentrated in math in the early grades and seem to have plateaued or possibly reversed in recent years… As for charter schools, studies have shown that they have not on average performed appreciably better than regular public schools.”  To his credit, Carey explains that mistrust threatens human relationships and institutions, and he criticizes No Child Left Behind for driving mistrust of teachers and public education in general. In fact, the law’s primary mechanism was to threaten educators with punishments if they could not produce ever higher test scores. It blamed schoolteachers for problems we now know they cannot control.

While Carey is correct that support for the test-and-punish strategy of No Child Left Behind has waned and that skepticism is growing about the rapid expansion of charter schools, his analysis fails to explore several of the most important reasons for the failure of the reforms The Education Trust endorsed.  Certainly his focus on Amy Wilkins narrows the issues he emphasizes.  Here are academic researchers addressing three problems Carey fails to address:

FIRST  In The Testing Charade: Pretending to Make Schools Better, Daniel Koretz, a Harvard University expert on standardized testing, documents research exposing flaws in the entire strategy of No Child Left Behind.  While Carey quotes Wilkins alleging that teachers should have been tougher and resisted pressures to narrow the curriculum and drill for the tests, Koretz describes social scientist Don Campbell’s well-known theory describing the universal human response when high stakes (in the case of No Child Left Behind–closing schools, charterizing schools, firing principals, firing teachers) are tied to a quantitative social indicator (the assumption that teachers can produce higher aggregate student test scores year after year): “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor… Achievement tests may well be valuable indicators of… achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (The Testing Charade, pp. 38-39)  Koretz shows that imposing high stakes punishments on schools and teachers unable quickly to raise students’ scores inevitably produced reallocation of instruction to what would be tested, caused states eventually to lower standards, caused some schools quietly to exclude from testing the students likely to fail, and led to abject cheating—as happened in Atlanta under Superintendent Beverly Hall.

SECOND  Research has demonstrated not only that state legislatures have persistently underfunded their public schools, but also that the rapid expansion of charter schools has been draining millions of dollars out of the school districts where the charter schools are located.  The best documented example is in the Oakland Unified School District, where political economist Gordon Lafer reports that charter schools drain $57.3 million annually out of the public schools.  Here’s why: “To the casual observer, it may not be obvious why charter schools should create any net costs at all for their home districts. To grasp why they do, it is necessary to understand the structural differences between the challenge of operating a single school—or even a local chain of schools—and that of a district-wide system operating tens or hundreds of schools and charged with the legal responsibility to serve all students in the community.  When a new charter school opens, it typically fills its classrooms by drawing students away from existing schools in the district…  If, for instance, a given school loses five percent of its student body—and that loss is spread across multiple grade levels, the school may be unable to lay off even a single teacher… Plus, the costs of maintaining school buildings cannot be reduced…. Unless the enrollment falloff is so steep as to force school closures, the expense of heating and cooling schools, running cafeterias, maintaining digital and wireless technologies, and paving parking lots—all of this is unchanged by modest declines in enrollment. In addition, both individual schools and school districts bear significant administrative responsibilities that cannot be cut in response to falling enrollment. These include planning bus routes and operating transportation systems; developing and auditing budgets; managing teacher training and employee benefits; applying for grants and certifying compliance with federal and state regulations; and the everyday work of principals, librarians and guidance counselors.” “If a school district anywhere in the country—in the absence of charter schools—announced that it wanted to create a second system-within-a-system, with a new set of schools whose number, size, specialization, budget, and geographic locations would not be coordinated with the existing school system, we would regard this as the poster child of government inefficiency and a waste of tax dollars. But this is indeed how the charter school system functions.”

THIRD  Despite many people’s hope that if public schools worked harder and smarter, our society could leave no child behind, it is now well documented that public schools by themselves cannot solve economic inequality and child poverty. David Berliner is the Regents’ professor emeritus at Arizona State University, former president of the American Educational Research Association and former dean of the College of Education at Arizona State University.  Berliner explains: “(T)he big problems of American education are not in America’s schools. So, reforming the schools, as Jean Anyon once said, is like trying to clean the air on one side of a screen door. It cannot be done!  It’s neither this nation’s teachers nor its curriculum that impede the achievement of our children. The roots of America’s educational problems are in the numbers of Americans who live in poverty. America’s educational problems are predominantly in the numbers of kids and their families who are homeless; whose families have no access to Medicaid or other medical services. These are often families to whom low-birth-weight babies are frequently born, leading to many more children needing special education… Our educational problems have their roots in families where food insecurity or hunger is a regular occurrence, or where those with increased lead levels in their bloodstream get no treatments before arriving at a school’s doorsteps. Our problems also stem from the harsh incarceration laws that break up families instead of counseling them and trying to keep them together. And our problems relate to harsh immigration policies that keep millions of families frightened to seek out better lives for themselves and their children…  Although demographics may not be destiny for an individual, it is the best predictor of a school’s outcomes—independent of that school’s teachers, administrators and curriculum.” “We certainly do not have the legally sanctioned apartheid of South Africa.  But we should recognize that we do have heavily segregated systems of housing. In New York and Illinois, over 60 percent of black kids go to schools where 90-100 percent of the kids are nonwhite and mostly poor.  In California, Texas and Rhode Island, 50 percent or more of Latino kids go to schools where 90-100 percent of the kids are also not white, and often poor. Similar statistics hold for American Indian kids.” (Emphasis in the original.)

To summarize the urgent realities that Carey omits from this week’s article but which, together, discredit twenty years of test-and-punish, accountability-based school reform, we can turn to the National Education Policy Center’s Bill Mathis and Tina Trujillo, a professor at the University of California at Berkeley, who explain that school reform must address the enormous disparities in opportunity among our children.  Such an effort would address school funding inequity—the reason Democrats running for President this year have endorsed quadrupling or tripling the federal investment in Title I. It will also be necessary to define the problem not merely as an achievement gap, but instead as an opportunity gap:

“We cannot expect to close the achievement gap until we address the social and economic gaps that divide our society. No Child Left Behind had the explicit purpose of all children achieving high standards and thereby closing the achievement gap by 2014. It did not close. Noting the widening academic achievement gap between rich and poor, Sean Reardon found the gap ‘roughly 20 to 40 percent larger among children born in 2001 than among those born 25 years earlier’… In an economic and social shift, he reports that family income is now nearly as strong a predictor as parental education. The income achievement gap, which is closely tied to the racial gap, is attributable to income inequality, the increased difficulty of social mobility, the bifurcation of wages and the economy, and a narrowing of school purposes driven by test taking… Low test scores are indicators of our social inequities… Otherwise, we would not see our white and affluent children soaring at the highest levels in the world and our children of color scoring equivalent to third-world countries. We also would not see our urban areas, with the lowest scores and greatest needs, funded well below our higher scoring suburban schools. With two-thirds of the variance in test scores attributable to environmental conditions, the best way of closing the opportunity gap is through providing jobs and livable wages across the board.”

Faith in High Stakes Testing Fades, Even Among the Corporate School Reformers

After a recent twenty-fifth anniversary conference at the Center on Reinventing Public Education at the University of Washington, Bothell, a Gates-funded education-reformer think tank, Chalkbeat’s Matt Barnum summarized presentations by a number of speakers who demonstrate growing skepticism about the high-stakes, standardized testing regime that has dominated American public education for over a quarter of a century.

Because the Center on Reinventing Public Education is known as an advocate for portfolio school reform and corporate accountability, you might expect adherence to the dogma of test-and-punish, but, notes Barnum:  “The pervasiveness of the complaints about testing was striking, given that many education reform advocates have long championed using test scores to measure schools and teachers and then to push them to improve.”

Then at a Massachusetts Institute of Technology School Access and Quality Summit early this month, Paymon Rouhanifard presented a major policy address challenging the use of high stakes testing to rank and rate public schools.  Rouhanifard was until very recently Chris Christie’s appointed, school-reformer superintendent in Camden, New Jersey.  Formerly he was the director in New York City of Joel Klein’s Office of Portfolio Management.  Rouhanifard describes the belief system he brought with him to Camden and explains how his five-year tenure as Camden’s superintendent transformed his thinking: “Our belief was that politics and bureaucracy had inhibited the progress Camden students and families deserved to overcome the steep challenges the city was facing…  We believed it was important for the district to segue out of being a highly political monopoly operator of schools….  This is a story about an evolution of my own thinking during that five-year experience…. What I’m referring to are the math and literacy student achievement data we utilize to drive so many of the critical decisions we make… My realization a few years ago was that I rarely asked questions about what these tests actually told us.  What they didn’t tell us.  And perhaps most importantly, what were the specific behaviors they incentivized, and what were the general trade-offs when we acutely focus on how students do on state tests.”

In 2013, at the beginning of his tenure, Rouhanifard introduced a school report card that rated each school primarily by students’ standardized test scores. Two years ago Rouhanifard eliminated his own school report cards.  He describes his realization: “We are spending an inordinate amount of time on formative and interim assessments and test prep, because those are the behaviors we have incentivized.  We are deprioritizing the sciences, the arts, and civic education…. I… believe the drawbacks currently outweigh the benefits.  That we haven’t been honest about the trade-offs.”

Shael Polakow-Suransky, like Rouhanifard, held a position in Joel Klein’s “reformer” school administration in New York City.  Now the president of Bank Street College of Education, he was formerly Klein’s deputy schools chancellor. Barnum explains that Polakow-Suransky has become an emphatic critic of the nation’s high-stakes standardized testing regime: “The biggest barrier to student learning and closing the achievement gap is the current system of standardized tests.”

In a piece at The74, the Thomas Fordham Institute’s Robert Pondiscio quotes Polakow-Suransky: “All of us were well-intentioned in pushing this agenda, but the tools we developed were not effective in raising the bar on a wide scale.”

While the Thomas Fordham Institute has endorsed corporate school reform including high-stakes, test-based accountability, Fordham’s Pondiscio now acknowledges that under the Every Student Succeeds Act, U.S. public schools have become mired in an education culture defined by test-based accountability.  Though he seems unclear on the way forward, Pondiscio now advocates for serious reconsideration: “The challenge is not testing vs. not testing.  It’s not accountability vs. none.  Both bring benefits of different kinds, and both are required by a federal law that’s not going to change anytime soon.  The challenge is to develop a policy vision that supports—not thwarts—the classroom practices and long-term student outcomes we seek… The problem is the reductive culture of testing, which has come to shape and define American education, particularly in the kinds of schools attended by our most disadvantaged children.”

There are some who remain faithful to the school reformer dogma. The Center on Reinventing Public Education’s Robin Lake tries to change the subject: “We need a more productive debate about school accountability, not tired arguments over testing.” And Matt Barnum quotes Sandy Kress—still a tried-and-true believer in the No Child Left Behind regime he helped create: “Research shows clearly that accountability made a real difference in this country in narrowing the achievement gap and lifting student achievement.”

Of course, research does not clearly show that Sandy Kress’s kind of No Child Left Behind accountability made a real difference.  Here is Harvard’s Daniel Koretz, in the authoritative book he published a year ago, The Testing Charade: Pretending to Make Schools Better.  It is perhaps this volume by an academic expert on testing that has helped change the minds of some of the corporate school reformers quoted above.  Koretz writes: “It is no exaggeration to say that the costs of test-based accountability have been huge.  Instruction has been corrupted on a broad scale.  Large amounts of instructional time are now siphoned off into test-prep activities that at best waste time and at worst defraud students and their parents.  Cheating has become widespread.  The public has been deceived into thinking that achievement has dramatically improved and that achievement gaps have narrowed.  Many students are subjected to severe stress, not only during testing but also for long periods leading up to it.  Educators have been evaluated in misleading and in some cases utterly absurd ways.  Careers have been disrupted and in some cases ended.  Educators, including prominent administrators, have been indicted and even imprisoned.  The primary benefit we received in return for all of this was substantial gains in elementary-school math that don’t persist until graduation.  This is true despite the many variants of test-based accountability the reformers have tried, and there is nothing on the horizon now that suggests that the net effects will be better in the future. On balance, then, the reforms have been a failure.” (The Testing Charade, pp. 191-192)

Introducing readers to Don Campbell, “one of the founders of the science of program evaluation,” Koretz defines the problems inherent in our society’s quarter century of high-stakes, test-and-punish school accountability by quoting Campbell’s Law:  “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”  Campbell directly addresses the problem of high stakes testing to rank and rate schools:  “Achievement tests may well be valuable indicators of … achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (The Testing Charade, pp. 38-39)

How has the testing regime operated perversely to undermine the schools serving our society’s most vulnerable children—the ones we were told No Child Left Behind would catch up academically if only we created incentives and punishments to motivate their teachers to work harder?  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools.  The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others.  Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do.  This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’  It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic.  The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

Besides imposing unreasonable and damaging punishments on the schools and teachers serving our society’s poorest children, Koretz believes our commitment to a regime of punitive testing has distracted our society from developing the commitment to address the real needs of children and schools in places where poverty is concentrated: “We can undoubtedly reduce variations in performance appreciably, if we summoned the political will and committed the resources to do so—which would require a lot more than simply imposing requirements that educators reach arbitrary targets for test scores.” (The Testing Charade, p. 131)

Rick Hess’s Mistake: Failure of Test-and-Punish Is Not Limited to a Few Districts That Have Disappointed

Frederick M. Hess, the director of education policy studies at the American Enterprise Institute, has always been a corporate education reform kind of guy. That is why Hess’s honest analysis this week of the ultimate fraud of a succession of school district miracles—Washington, D.C.’s test score and graduation rate miracle under Michelle Rhee and those who followed her, Alonzo Crim’s Atlanta in the 1980s, Houston’s Texas Miracle under Rod Paige, Arne Duncan’s Chicago, and Beverly Hall’s Atlanta—is so refreshingly candid.

In all of these cases, as Hess points out, there was “a remarkable dearth of attention paid to ensuring that the metrics (were) actually valid and reliable.”  Second, it was “tempting for civic leaders and national advocates to accept happy success stories at face value—especially when they (were) fronted by a charismatic superintendent.” And finally “reformers and reporters (made) things worse with their lust for ‘celebrity superintendents’ and ‘model systems.’ Their fascination nurtur(ed) an echo chamber in which a handful of leaders (got) exalted, often for too-good-to-be-true results.”

One must give Hess credit for honestly admitting the failure of so much of what his own kind of school reformers have been exalting for the past quarter century—business school accountability for schools, driven by universal standardized testing, and evaluated by two primary outcomes—standardized test scores and graduation rates. But Hess makes a mistake when he attributes the problem to a few “model” school districts that have disappointed.

Hess’s explanation is inadequate.  Inadequate because the system itself—the whole idea of school reform based on high stakes testing—cannot work.  Daniel Koretz, the Harvard specialist on testing, tells us why in a recent book: The Testing Charade: Pretending to Make Schools Better.

Koretz defines the problem with high-stakes-test-based school accountability by exploring a primary principle of social science research. Forty years ago, Don Campbell, “one of the founders of the science of program evaluation,” articulated a core principle now known as “Campbell’s Law”: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (p. 38)

How does Campbell’s Law describe the dilemma Frederick Hess identifies?  Koretz quotes Don Campbell himself describing the distortion that will follow when high stakes consequences are attached to a school district’s capacity to raise its aggregate test scores: “Achievement tests may well be valuable indicators of… achievement under conditions of normal teaching aimed at general competence.  But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (p. 39)

In The Testing Charade, Koretz provides extensive evidence about all the ways high stakes tied to test scores have triggered Campbell’s Law—to invalidate the test results themselves and to undermine our education system and the experiences of teachers and students trapped by No Child Left Behind and the Every Student Succeeds Act in a scheme to raise test scores at all costs.

One consequence is score inflation: “All that is required for scores to become inflated is that the sampling used to create a test has to be predictable… For inflation to occur, teachers or students need to capitalize on this predictability, focusing on the specifics of the test at the expense of the larger domain.” (p. 62)  We read about all the ways curriculum designers and teachers are incentivized to focus their classes on the specific elements of any particular academic discipline that have appeared on previous tests.

A second consequence, related to the first, is flat-out test-prep. Test prep narrows what is taught to students to the material that is tested and drills students about using clues in the test itself to come up with the right answers. Koretz identifies three kinds of bad test prep. Reallocation between subjects has been common when schools emphasize No Child Left Behind’s tested subjects—reading and math—and cut back on social studies, the arts, music and recess. Reallocation within subjects is when schools study past years’ versions of the state tests and ask teachers to focus on particular aspects of a subject.  Finally there is coaching. Schools and test-prep companies teach students to respond in a formulaic way to the format of the questions themselves. Koretz explains why all this has implications for educational equity: “Inappropriate test preparation, like score inflation, is more severe in some places than in others. Teachers of high-achieving students have less reason to indulge in bad preparation for high-stakes tests because the majority of their students will score adequately without it—in particular, above the ‘proficient’ cut score that counts for accountability purposes. So one would expect that test preparation would be a more severe problem in schools serving high concentrations of disadvantaged students…. Once again, disadvantaged kids are getting the short end of the stick.” (pp. 116-117)

And a third consequence, demonstrated in every one of Frederick Hess’s examples, is cheating. Koretz examines the biggest cheating scandals, notably Atlanta, Philadelphia, and Washington, DC.  He notes: “Cheating—by teachers and administrators, not by students—is one of the simplest ways to inflate scores, and if you aren’t caught, it’s the most dependable.” Sometimes teachers or administrators erase and change students’ answers; sometimes they provide teachers or students with the test items in advance; other times teachers give students the answer during the test.  And finally sometimes schools “scrub” off the enrollment rolls the students who are likely to fail.

Koretz presents the questions around cheating by educators as morally fraught. After all, test scores are not simply a proxy for the quality of a school or a school district:  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

In a system that, by its very structure, is guaranteed to trigger Campbell’s Law, Koretz wonders about the moral implications of cheating: “Just who is responsible?  Is it just the people who actually carry out the fraud or require it?  Or are those who create the pressures to cheat also culpable, even if not criminally?” (p. 91)

Like Frederick Hess, Daniel Koretz recognizes that although outcomes-based, test-and-punish school accountability has been hyped and celebrated, ultimately this kind of school policy has not improved schools as promised.  Koretz digs deeper, however, to expose that the system itself—not merely its abuse by particular educators in particular school districts—is deeply flawed.

Koretz concludes: “It is no exaggeration to say that the costs of test-based accountability have been huge. Instruction has been corrupted on a broad scale. Large amounts of instructional time are now siphoned off into test-prep activities that at best waste time and at worst defraud students and their parents.  Cheating has become widespread. The public has been deceived into thinking that achievement has dramatically improved and that achievement gaps have narrowed. Many students are subjected to severe stress… The primary benefit we received in return for all of this was substantial gains in elementary-school math that don’t persist until graduation… On balance, then, the reforms have been a failure.” (pp. 191-192)

Repeating My Recommendation: Please Read Daniel Koretz’s Book, “The Testing Charade”

How has high stakes testing ruined our schools and how has this strategy, which was at the heart of No Child Left Behind, made it much more difficult to accomplish No Child Left Behind’s stated goal of reducing educational inequality and closing achievement gaps?

Here is how Daniel Koretz begins to answer that question in his 2017 book, The Testing Charade: Pretending to Make Schools Better: In 2002, No Child Left Behind “mandated that all states use the proficient standard as a target and that 100 percent of students reach that level. It imposed a short timeline for this: twelve years. It required that schools report the performance of several disadvantaged groups and it mandated that 100 percent of each of these groups had to reach the proficient standard. It required that almost all students be tested the same way and evaluated against the same performance standards.  And it replaced the straight-line approach by uniform statewide targets for percent proficient, called Adequate Yearly Progress (AYP)…. The law mandated an escalating series of sanctions for schools that failed to make AYP for each reporting group.” Later, “Arne Duncan used his control over funding to increase even further the pressure to raise scores.  The most important of Duncan’s changes was inducing states to tie the evaluation of individual teachers, rather than just schools, to test scores… The reforms caused much more harm than good. Ironically, in some ways they inflicted the most harm on precisely the disadvantaged students the policies were intended to help.”

Koretz poses the following question and his book sets out to answer it: “But why did the reforms fail so badly?”

I recommend Daniel Koretz’s book all the time as essential reading for anyone trying to figure out how we got to the deplorable morass that is today’s federal and state educational policy.  I wish I thought more people were reading this book. Maybe people are intimidated that its author is a Harvard expert on the design and use of standardized tests.  Maybe it’s the fact that the book was published by the University of Chicago Press. But I don’t see it in very many bookstores, and when I ask people if they have read it, most people tell me they intend to read it. To reassure myself that it is really worth reading, I set myself the task this past weekend of re-reading the entire book. And I found re-reading it to be extremely worthwhile.

The book divides into three parts: an introductory section of several chapters, six or seven chapters in the middle that dissect the way high stakes testing has undermined education and damaged the education of our nation’s poorest children, and some wrap-up chapters. It is the middle part that is essential. While Koretz has some ideas near the end about where we go from here, his analysis of the damage caused is the crucial part. After all, this section at the heart of the book addresses the conversational dilemma many readers of this blog must face as often as I do. What can you say to the person who doggedly tells you that a particular school is a fine school because its scores are high and another school is a failure because its test scores are so low? This person, often well-intentioned, has lived with test-based school accountability for so long that he cannot imagine there is any other way to consider school quality. And anyway, he says, standardized testing is what we have to evaluate schools, so it’s what we need to use.

Koretz explains a 40-year-old social science rule first articulated by Don Campbell, whom Koretz identifies as “one of the founders of the science of program evaluation.” Here is how Campbell stated what we now call “Campbell’s Law”: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” The rest of the central chapters in Koretz’s book explain precisely how the use of high stakes punishments tied to low test scores has triggered Campbell’s Law. What are the high stakes punishments?  First came the school turnarounds prescribed by No Child Left Behind: firing the principal and half the teachers, closing the school, charterizing the school.  Later Arne Duncan added the evaluation of teachers by students’ test scores, along with schemes rewarding teachers whose students scored well and firing the teachers whose students posted low scores. Koretz summarizes No Child Left Behind’s test-and-punish strategy: “The reformers’ implicit assumption seemed to be that many teachers knew how to teach more effectively but were being withholding, and therefore confronting them with sanctions and rewards would be enough to get them to deliver.”

Three chapters explore how No Child Left Behind’s test-and-punish strategy has distorted schooling itself and has undermined how teachers teach and how students learn.

  • Score Inflation: When the state achievement tests mandated by No Child Left Behind (the ones that would bring negative consequences for schools and teachers) were compared by experts like Koretz himself to an “audit” test such as the National Assessment of Educational Progress (NAEP), which has no high stakes consequences, the researchers discovered that while the scores on the state tests rose rapidly, NAEP scores remained flat.  Koretz comments: “(I)ncreases in scores are meaningful only if they signal similar increases in mastery of the domain.  If they do generalize to the domain, gains should appear on other tests that sample from the same domain.” He continues: “(A)ll that is required for scores to become inflated is that the sampling used to create a test has to be predictable… For inflation to occur, teachers or students need to capitalize on this predictability, focusing on the specifics of the test at the expense of the larger domain.”  And there are equity concerns here, because score inflation has occurred more often in schools serving poor students: “Ongoing work by my own group has shown… that it is not just the poverty of individual students that predicts the amount of inflation but also the concentration of poor students in a school… (S)chools with a higher proportion of poor students showed greater average inflation.” Teachers under pressure are finding a way to raise test scores without really teaching the students the material they are supposed to be learning.  Some schools have also inflated overall scores by focusing primarily on children right at the pass/fail level and paying less attention to students far behind.
  • Cheating: Koretz examines the big cheating scandals, notably Atlanta, Philadelphia, and Washington, DC.  He notes: “Cheating—by teachers and administrators, not by students—is one of the simplest ways to inflate scores, and if you aren’t caught, it’s the most dependable.” Sometimes teachers or administrators erase and change students’ answers; sometimes they provide teachers or students with the test items in advance; other times teachers give students the answers during the test.  And finally, sometimes schools “scrub” off the enrollment rolls the students who are likely to fail.
  • Test Prep: Test prep narrows what is taught to students to the material that is tested.  Koretz identifies three kinds of bad test prep. Reallocation between subjects has been common when schools emphasize No Child Left Behind’s tested subjects—reading and math—and cut back on social studies, the arts, music and recess. Reallocation within subjects is when schools study past years’ versions of the state tests and ask teachers to focus on particular aspects of a subject.  Finally there is coaching. Schools and test-prep companies teach students to respond in a formulaic way to the format of the questions themselves. Koretz explains why all this has implications for educational equity: “Inappropriate test preparation, like score inflation, is more severe in some places than in others. Teachers of high-achieving students have less reason to indulge in bad preparation for high-stakes tests because the majority of their students will score adequately without it—in particular, above the ‘proficient’ cut score that counts for accountability purposes. So one would expect that test preparation would be a more severe problem in schools serving high concentrations of disadvantaged students…. Once again, disadvantaged kids are getting the short end of the stick.”

Two chapters in this middle section explore the ways No Child Left Behind’s test-and-punish scheme has undermined equitable access to education in the schools in areas of concentrated poverty across our cities. The law that promised to leave no child behind not only encouraged test prep and cheating in the schools whose needs were greatest, but it also set impossibly tough and largely arbitrary test score targets for those schools and an impossibly short timeline for bringing students up to those targets.  And then the federal government set out to punish the schools and the teachers unable to meet the targets.

  • Making Up Unrealistic Targets: In this chapter, Koretz explains how No Child Left Behind’s standardized cut scores and timelines were set unrealistically and arbitrarily; the consequence was to label schools in poor areas as “failing” and to subject schools in areas of concentrated poverty to a series of punishments. Here is Koretz’s short summary: “Part of the blame for this failure lies with the crude and unrealistic methods used to confront inequity.  In a nutshell, the core of the approach has been simply to set an arbitrary performance target (the ‘Proficient’ standard) and declare that all schools must make all students reach it in an equally arbitrary amount of time.  No one checked to make sure the targets were practical.  The myriad factors that cause some students to do poorly in school—both the weaknesses of many of the schools they attend and the disadvantages some students bring to school—were given remarkably little attention. Somehow teachers would just pull this off… The trust most people have in performance standards is essential, because the entire educational system now revolves around them. The percentage of kids who reach the standard is the key number determining which teachers and schools will be rewarded or punished… But in fact, despite all the care that goes into creating them, these standards are anything but solid. They are arbitrary, and the ‘percent proficient’ is a very slippery number… A primary motivation for setting a Proficient standard is to prod schools to improve, but information about how quickly teachers actually can improve student learning doesn’t play much, if any, of a role in setting performance standards… However, setting the standards themselves is just the beginning. What gives the performance standards real bite is their translation into concrete targets for educators, which depends on more than the rigor of the standard itself… We have to say how quickly performance has to increase—not only overall but for different types of kids and schools. A less obvious but equally important question is how much variation in performance is acceptable.”
  • Evaluating Teachers: In 2009, beginning with Race to the Top and later as a condition for states to qualify for waivers from the worst consequences of No Child Left Behind, Arne Duncan’s Department of Education required states to change their laws to tie a percentage of teachers’ formal evaluations to students’ test scores. Myriad problems ensued. First of all, the required tests are in reading and math. What about the other teachers? Koretz describes Florida and Tennessee, which judged teachers in non-tested grades and subjects by the scores of students who were not in their classes, and in one case not in their schools.  Other states added tests in music, art, and physical education—subjecting students to added standardized testing—just for the purpose of state teacher evaluations.  Koretz explains the problems with Value-Added Modeling to evaluate teachers; many factors affecting students’ scores cannot be traced to any teacher and any teacher’s ratings seem to be unstable over several years.

I cannot imagine exactly how our society can recover from our terrible test-and-punish misadventure and our labeling as “failing” the institutions and teachers who serve our poorest children.  What is heartening about The Testing Charade: Pretending to Make Schools Better is the clarity with which Daniel Koretz presents our current dilemma: “We now know what many educators did.  Faced with unrealistic targets, some cut corners or simply cheated.  And perhaps because the system, in its zeal to address inequities, made the targets most unrealistic for educators serving disadvantaged kids, those kids—ironically—got the worst of it: the most test prep, the most score inflation, and apparently the most cheating.  And yet inflated scores allowed policy makers to declare victory, and the public received a steady diet of encouraging but bogus news about rapid improvements in the achievement gap…. On balance… the reforms have been a failure.”

Please read The Testing Charade.  We all need to understand and be able to explain how we’ve gone so far astray.

Michelle Rhee’s D.C. Education Revolution Continues to Collapse

The Associated Press‘s Ashraf Khalil explains: “As recently as a year ago, the public school system in the nation’s capital was being hailed as a shining example of successful urban education reform and a template for districts across the country. Now…. after a series of rapid-fire scandals including one about rigged graduation rates, Washington’s school system has gone from a point of pride to perhaps the largest public embarrassment of Mayor Muriel Bowser’s tenure… A decade after a restructuring that stripped the decision-making powers of the board of education and placed the system under mayoral control, city schools in 2017 were boasting rising test scores and a record graduation rate for high schools of 73 percent, compared with 53 percent in 2011… Then everything unraveled.”

Teachers’ jobs were threatened if they didn’t raise test scores and increase graduation rates by passing students no matter what. Money was an incentive, with bonuses rewarding successful teachers. The Washington Post revealed that last year, while the district bragged about declining suspension rates, the reports were fake. Many high schools were suspending students without documentation.

Then an investigation by Washington, D.C.’s NPR station, WAMU, uncovered that Ballou High School had been covering up massive student absences. Many students had been chronically absent and missed so many days of classes that mandatory course failure should have followed. But teachers failed to mark students absent, and students who were failing classes were being allowed to complete inadequate credit recovery projects so that they would pass courses and be allowed to graduate.

A district-wide investigation into practices during the 2016-2017 school year revealed that the problem wasn’t just at Ballou, but had spread district-wide. Khalil reports that, “about half of those Ballou graduates had missed more than three months of school and should not have graduated due to chronic truancy. A subsequent inquiry revealed a systemwide culture that pressured teachers to favor graduation rates over all else—with salaries and job security tied to specific metrics.  The internal investigation concluded that more than one-third of the 2017 graduating class should not have received diplomas due to truancy or improper steps taken by teachers or administrators to cover the absences.  In one egregious example, investigators found that attendance records at Dunbar High School had been altered 4,000 times to mark absent students as present. The school system is now being investigated by both the FBI and the U.S. Education Department.”

The school district responded this past winter by tightening attendance monitoring and enforcing course requirements.  In April of this year, the Washington Post‘s Perry Stein reported that, “Fewer than half of the seniors in the District’s traditional public school system are on track to receive their diplomas in June…. The city released a first batch of data in February, which showed that 42 percent of seniors attending traditional public schools were on track to graduate, while 19 percent were considered ‘moderately off track.’”

Last week Stein updated the story, reporting that 60 percent of seniors earned diplomas on time this month: “The school system said this week that 415 students who were considered ‘moderately off track’ in April received their diplomas in June. Forty students who were ‘significantly off track’ graduated… Many of the off-track students enrolled in credit-recovery courses to graduate on time.”  She adds that some students are likely to graduate after completing summer school.

Stein tracks the graduation rates by student demographics and finds them to be predictable and unfortunate: “At Banneker High, a selective-application school, 99 percent of seniors graduated this month—the highest rate in the District. The lowest rates belonged to Anacostia (42 percent), Coolidge (44 percent) and Ballou (45 percent).”

In a puzzling development, Stein reports: “The D.C. Council passed emergency legislation last week allowing high school seniors who missed more than six weeks of class to receive their diplomas by discounting absences from the first three quarters of the school year.” The members of the Council are reported to have reasoned that students shouldn’t be held accountable for attendance rules that were tightened only after the graduation crisis was discovered.  Mayor Muriel Bowser has said she does not support the emergency law, but she has not yet decided whether to sign it.  Stein added: “But Bowser, whose signature is necessary for the reprieve to go into effect, has said she opposes it.  She said last week she is considering her options, and her office said Wednesday there is no update on her decision.”  If the law is signed, students will receive their diplomas late, as commencement exercises have already taken place.

What has been occurring in the D.C. public schools in recent years (allowing students to miss so much school they become chronically absent, hiding the fact that they are not in school, and assigning short and easy credit recovery projects when students are failing classes) has been driven by promises to make the D.C. Public Schools a model. Michelle Rhee set up a system to pressure teachers and school administrators by threatening to fire those who failed and paying merit bonuses to those who could make themselves look successful.  In a new book, The Testing Charade: Pretending to Make Schools Better, Harvard University’s Daniel Koretz explains why making graduation rates the primary measure of success will taint the process and undermine the results. Koretz writes about Campbell’s Law, a well-known principle in the social sciences: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (The Testing Charade, p. 38)

When Michelle Rhee had been the leader of the D.C. Public Schools for several years, John Merrow, now retired from the PBS NewsHour, documented a major test score cheating scandal driven by Rhee’s demand that teachers raise test scores and her techniques for getting scores to rise: making teachers’ and principals’ evaluations, hiring and firing, and merit bonuses depend on educators’ capacity to raise scores. Later, under former Chancellor Kaya Henderson, rising graduation rates became a second primary metric.  Koretz explains how Campbell’s Law actually works: “(W)hen you hold people accountable using a numeric measure—vehicle emissions, scores on a test, whatever—two things generally happen: they do things you don’t want them to do, and the measure itself becomes inflated, painting too optimistic a view of whatever it is that the system is designed to improve.” (The Testing Charade, p. 38)  Koretz elaborates: “(Y)ou can take Campbell’s Law to the bank. It’s going to show up in any high-pressure accountability system that is based only on a few hard numbers.” (The Testing Charade, pp. 46-47)

Koretz predicts the consequences of the kind of school reform Michelle Rhee brought to Washington, D.C.: cheating; lowering standards, which is what has happened in the D.C. graduation scandal; and excluding people with bad numbers, which is what may have happened in the D.C. schools when suspension rates were fudged.

Fine “Washington Post” Piece Traces Collapse of Michelle Rhee’s D.C. Legacy

In January of 2002, the No Child Left Behind Act was signed into law, establishing a high stakes testing regime with all children tested in grades 3-8 and once in high school. Test-and-punish school accountability meant annual testing and also a set of punishments for so-called failing schools and their staffs. The punishments eventually put in place were closing schools, firing teachers and principals, and privatizing or charterizing schools. States were eventually required to use students’ standardized test scores as a significant percentage of their formal evaluation process for teachers. The assumption behind all this was that incentives and punishments would make educators work harder and that standardized test scores would rise and achievement gaps would close. But test scores didn’t rise and achievement gaps didn’t close.

No school district epitomized this sort of data-driven, standardized test-based school reform like Washington, D.C.  In 2007, Michelle Rhee was brought in as appointed schools chancellor by Adrian Fenty, a new mayor who was given authorization for mayoral control of the school district. Fenty and his appointed chancellor created the grand illusion of success through mayoral governance and data-driven school reform. Washington, D.C. was said to be the symbol of school district turnaround.  Now we know most of it was a mere illusion.

Last weekend, three reporters for the Washington Post collaborated to trace the history of the supposed Washington, D.C. school miracle and summarize the tragic results: “In the decade after the city dissolved its elected local school board and turned management of the schools over to the mayor, Rhee and her successor, Kaya Henderson, created a system that demanded ever-higher accomplishments—higher test scores, higher graduation rates. They used money as an incentive: Principals and teachers were rewarded financially if they hit certain numbers. And with only weak oversight from the D.C. Council and other city education agencies—which report to the same mayor who is politically liable for the schools—there was no strong check on any impulse to gloss over shortcomings and pump up numbers. City lawmakers repeatedly boasted that the District’s schools had become the fastest-improving in the nation. Philanthropic dollars poured in… And one of the most dysfunctional school systems in America became known as a model for education reform efforts nationwide.”

Here is what the Post‘s reporters conclude: “If there is any simple truth about urban school reform, it may be this: It’s really hard. There are no miracles. The District’s scores have risen faster on national math and reading tests than anywhere else, but the improvements were driven in part by an influx of affluent families who enrolled children in the schools, helping boost scores. City officials invested billions of dollars to construct gleaming buildings, but that did not help close what remains the largest achievement gap between black and white students in a major U.S. city.”

The latest scandal, a subject this blog has previously covered, is a massive graduation rate crisis, where students in the city’s poorest high schools have been pushed toward graduation despite a pattern of chronic absence and teachers allowing students to make up work through short extra-credit assignments and superficial credit recovery programs. Now that officials have begun investigating and enforcing attendance and course completion requirements, it has become clear that the District’s graduation rate will plummet this year.

But there have been earlier warning signs.

Last weekend’s Washington Post report describes a history of practices aimed at improving the district’s appearance, if not the reality for its students:

  • “The District claimed a dramatic decline in suspensions, but a Washington Post investigation last summer showed that many city high schools were suspending students off the books, kicking students out without documentation—and in some cases even marking them present.”
  • Then there was the recent firing of the District’s newest Chancellor, Antwan Wilson, when he jumped a lottery waiting list to get his own daughter into the District’s highest scoring high school. Wilson had himself created some of the rules to tighten up on what had been a practice of letting powerful parents use their influence to secure special admissions for their own children.
  • A 2015 report by the National Research Council found that, “Eight years after Rhee’s arrival, and five years after her departure, poor and minority students were still far less likely to have an effective teacher in their classroom and perform at grade level.  Achievement gaps were as wide as ever.  About 60 percent of poor black students were below proficient in math and reading and had made only marginal gains since the changes were made.”
  • The reporters gloss over a significant cheating scandal under Michelle Rhee; it was difficult for reporters to conclusively document it because Rhee herself controlled the investigation.  The retired PBS reporter John Merrow has amassed the evidence, however.

The Washington, D.C. public schools have been the nation’s poster child for the idea that schools themselves can change the trajectory of children’s lives, and that test scores are the mark of a school’s success or failure.  In his new book, The Testing Charade: Pretending to Make Schools Better, Harvard’s Daniel Koretz demonstrates the problem with that assumption:

“One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary… (T)his decision backfired. The result was, in many cases, unrealistic expectations that teachers simply couldn’t meet by any legitimate means.” (pp. 129-134)

Challenging another of Michelle Rhee’s assumptions (the one about driving school reform through punishment, firing, and merit bonuses), Daniel Koretz attributes the kind of deception that has happened in Washington, D.C. to a well-known principle in the social sciences:  “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (p. 38)

Michelle Rhee set up a system in which educators were incentivized almost exclusively through carrots and sticks to meet ever rising demands. Rhee created a teacher evaluation process that either rewarded or fired teachers and principals according to the test score and graduation rate increases they produced.  Last weekend’s Washington Post evaluation of the past decade’s D.C. school reform depicts the details of the kind of pressure that Rhee and her successors have put on the District’s educators: “The District’s teachers are among the highest paid in the nation and can earn merit bonuses. In exchange, they also are more vulnerable to losing their jobs than teachers just about anywhere else.  Since 2007, hundreds have been fired.  Dozens of schools have been closed.  Other struggling schools have been ‘reconstituted,’ meaning everyone had to reapply for their jobs and many were not rehired.”  The reporters describe the annual “goal meeting” every principal was required to attend. Each year principals, meeting with their own superiors, were forced to promise they and their teachers would meet goals set by higher-ups, goals that leaders at individual schools knew were not realistic. “The focus on data carried the promise of a scientific approach to improvement.  But it came with fierce pressure to produce gains that critics said failed to take into account the influences on a child’s life outside of school.”

In Washington, D.C., each school’s accomplishments in raising test scores and each high school’s progress in raising graduation rates have been tracked by data. Merit bonuses have been tied to records of raising scores and raising graduation rates, while principals and teachers have been fired if they couldn’t raise test scores and graduation rates.  People under pressure found ways to meet the targets.

Now, as the Washington Post reporters conclude: “The revelations—coupled with the resignation of the chancellor after his own personal scandal and separately, allegations of enrollment fraud at one of the city’s most sought-after selective high schools—have shattered the simple narrative of success. Now, there is a groundswell of skepticism among parents, taxpayers and elected officials who are questioning how much of the touted progress is real.  It is the most prominent surge of such skepticism since 2008, when Rhee appeared on the cover of Time magazine with a broom to sweep away the old culture of failure and low expectations.”  Many are now questioning the wisdom of mayoral control of schools, a system that lacks the checks and balances provided by an elected school board.

New D.C. School Cheating Scandal: This Time It’s About Graduating Students Who Didn’t Do the Work

Last November, right after Thanksgiving, National Public Radio and WAMU in Washington, D.C. exposed a scandal at the District’s Ballou High School.  Last May the school had made headlines for graduating all of its seniors and getting every one admitted to college.  You would think we’d have caught on about such promised miracles by now, but apparently we are a gullible society when we want to believe.

Here is what WAMU reported: “An investigation by WAMU and NPR has found that Ballou High School’s administration graduated dozens of students despite high rates of unexcused absences.  WAMU and NPR reviewed hundreds of pages of Ballou’s attendance records, class rosters and emails after a DCPS employee shared the private documents.  The documents showed that half of the graduates missed more than three months of school last year, unexcused. One in five students was absent more than present—missing more than 90 days of school… Another internal e-mail obtained by WAMU and NPR from April shows that two months before graduation, only 57 students were on track to graduate, with dozens of students missing graduation requirements, community service requirements or failing classes needed to graduate. In June, 164 students received diplomas.”

You’ll remember that an earlier Washington, D.C. cheating scandal was exposed during Chancellor Michelle Rhee’s tenure. In March of 2011, USA Today broke the story about teachers erasing and correcting students’ answers on standardized tests. The problem was never fully investigated because Michelle Rhee controlled the contractor she hired to do the investigation, but John Merrow, the education reporter for the PBS NewsHour, eventually confirmed that massive cheating had occurred under Rhee.

While Rhee was never held accountable, the impact on the D.C. public schools is well known: the long repercussions both of Rhee’s leadership style and of the IMPACT plan she instituted for formal teacher evaluations. Although Rhee left D.C. in 2012, the IMPACT evaluation plan and promises of rapid school improvement have been maintained by her successors, first Kaya Henderson and now Antwan Wilson.  Last week in the Washington Post, Moriah Balingit, Peter Jamison and Perry Stein reported that Kaya Henderson announced she would raise graduation rates by 22 points in five years, and Wilson, her successor, made a similar commitment when he was hired.

In her Washington Post column, Valerie Strauss recently reviewed the history of Rhee’s influence on the D.C. public schools: “On Oct. 28, 2015, the D.C. Public Schools district put out a statement lauding itself with this headline: ‘D.C. Public Schools Continues Momentum as the Fastest Improving Urban School District in the Country.’  For years, that has been the national narrative about the long-troubled school district in the nation’s capital: After decades of low performance and stagnation, the system was moving forward with a ‘reform’ program that was a model for the nation. The triumphant story included rising standardized test scores and ‘miracle’ schools that saw graduation rates jump over the moon in practically no time.  Arne Duncan, President Barack Obama’s education secretary for seven years, called it ‘a pretty remarkable story’ in 2013…  Policymakers and school reformers—in the District and across the nation—chose to believe the ‘miracle’ narrative and ignore warning signs that were there all along… Meanwhile, the graduation rate—nationally and in the District—continued to rise, despite scandals revealing that schools were essentially juicing the books to make it seem like they were graduating more students. Scams included phony ‘credit recovery’ programs, failing to count all students, and, as the District just found out, letting kids graduate without the qualifications required for a diploma.”

Specifically, Strauss comments on the IMPACT teacher evaluation plan instituted by Rhee—and kept in place by Henderson and now Wilson: “The assessment system, known as IMPACT, that was introduced by Rhee… drew serious concerns from teachers and principals, who found it unworkable and unfair, with performance goals that were impossible to meet and metrics that were questionable… The pressure that IMPACT placed on educators and administrators—pressure that led to cheating on tests and phony graduation rates—was never acknowledged, at least until the new scandal.”

After WAMU and NPR exposed problems at Ballou High School, including the school’s practice of letting students make up for long, unexcused absences by doing an extra project and its slick and insufficient after-school credit-recovery sessions, a study of graduation practices was undertaken to determine whether what happened at Ballou might be widespread. The Post’s Perry Stein and Moriah Balingit describe findings of the new report, released on January 29: “Out of 2,758 students who graduated from D.C. public schools last year, more than 900 missed too many classes or improperly took makeup classes.” In a separate story, Stein reports the numbers for particular high schools: “At Anacostia High School in Southeast Washington, nearly 70 percent of the 106 graduates last year received their diplomas despite violating some aspect of city policy—the worst violation rate among comprehensive schools in the city.  At Ballou, the school whose mispractices spurred the investigation, 63 percent of graduates missed more classes than typically allowed, or inappropriately completed credit recovery…. One of the most damning findings came from Dunbar High School in Northwest Washington.  Teacher-centered attendance records at the school were modified from absent to present more than 4000 times for the senior class, which numbered fewer than 200.  Dunbar’s principal, Abdullah Zaki, was removed from the school in the wake of the findings.  Zaki… was named D.C. Public Schools’ principal of the year in 2013….”  The principal and assistant principal at Ballou High School have been fired along with the district’s Chief of Secondary Schools.

It is hard to know exactly how this sad story will end.  The FBI and the U.S. Department of Education’s Office of Inspector General both launched investigations last week.  But while we don’t know the outcome, we don’t have far to look for where the story began.  Once again, Harvard’s Daniel Koretz describes the problem driven almost entirely by faith in rapid school improvement as measured by data—this time using promises of miraculous graduation rate increases instead of rapid test score increases.  Remember that as a measure of school accountability, the 2015 federal Every Student Succeeds Act (the law that replaced No Child Left Behind) requires that states report not only disaggregated test scores on annual standardized tests, but also each secondary school’s graduation rate.

Daniel Koretz clearly explains the impact of trying to drive education policy through pressure to raise scores or graduation rates in his excellent new book, The Testing Charade: Pretending to Make Schools Better: “More than forty years ago, Don Campbell, one of the founders of the science of program evaluation wrote: ‘The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.’  In other words, when you hold people accountable using a numerical measure—vehicle emissions, scores on a test, whatever—two things generally happen: they do things you don’t want them to do, and the measure itself becomes inflated, painting too optimistic a view of whatever it is that the system is designed to improve.” (The Testing Charade, p. 38)

Of course we want more high school students—especially students in places like Washington, D.C.’s poorest neighborhoods—to thrive at school and graduate. High school graduation is a worthy accomplishment.  However, the current practice of pressuring teachers to push students through school to amp up the graduation statistics hurts both the students and the teachers.