The Problems of Outcomes-Based School Accountability

I am so tired of the narrative of “failing” schools—a story which is always accompanied by the story of “failing” teachers and their “failing” students. I find myself trapped in arguments about this subject in places where I don’t want to be talking about it—with good friends and relatives around dinner tables, at parties, during intermissions at concerts.  And even though I know a lot about the topic, I can never really win the argument, because the people with whom I am discussing it have always read about it in the newspapers where the test score comparisons are published.  This narrative has no reference whatsoever to what is happening in particular classrooms or particular schools or school districts. Many people with strong opinions have not been in a public school for decades.

The real subject here, of course, is what education is.  But the conversation instead is always a comparison of test scores as a proxy for the quality of a community and its schools.  One wants to get at the real meaning and purpose of outcomes-based, test-measured school accountability, but that is hard to do in a casual conversation.  And underneath any conversation about “failing” schools are lots of realities about segregation—by class and also by race.

Research has documented growing economic inequality and segregation by family income. Sean Reardon, a Stanford University sociologist, used a massive data set to document the consequences of widening economic inequality for children’s outcomes at school. Reardon showed that while in 1970, only 15 percent of families lived in neighborhoods classified as affluent or poor, by 2007, 31 percent of families lived in such neighborhoods. By 2007, fewer families across America lived in mixed-income communities. Reardon also demonstrated that growing residential inequality has been accompanied by a simultaneous jump in the income-based school achievement gap. The achievement gap between children with family income in the top ten percent and children with family income in the bottom ten percent was 30-40 percent wider among children born in 2001 than among those born in 1975, and twice as large as the black-white achievement gap.

Then there is segregation by race.  Recently I had occasion to revisit a 2014 article by Richard Rothstein on the long-term effects of racism in our caste society: “Even for low-income families, other groups’ disadvantages—though serious—are not similar to those faced by African Americans. Although the number of high-poverty white communities is growing (many are rural)… poor whites are less likely to live in high poverty neighborhoods than poor blacks.  Nationwide, 7 percent of poor whites live in high-poverty neighborhoods, while 23 percent of poor blacks do so. Patrick Sharkey’s Stuck in Place showed that multigenerational concentrated poverty remains an almost uniquely black phenomenon; white children in poor neighborhoods are likely to live in middle-class neighborhoods as adults, whereas black children in poor neighborhoods are likely to remain in such surroundings as adults.  In other words, poor whites are more likely to be temporarily poor, while poor blacks are more likely to be permanently so…. Certainly, Hispanics suffer discrimination, some of it severe… but the undeniable hardship faced by recent, non-English speaking, unskilled, low-wage immigrants is not equivalent to blacks’ centuries of lower-caste status. The problems are different, and the remedies must also be different….”

Our public schools across America are situated in very different communities—small towns of all sorts, small cities, big cities, poor neighborhoods, rich neighborhoods—schools whose children speak English and other schools where for many children, English is not the primary language. Within all this diversity, however, is the reality of segregation by race and, according to Reardon, growing segregation by family income.  In more and more places across America, children live in pockets of extreme poverty or pockets of extreme affluence.  While teachers can work with all the outside-of-school variables the children bring to their classrooms, including intensifying segregation by income, there is much of the experience of each child that schoolteachers cannot control. Children are neither blank slates nor empty vessels into which knowledge can be poured.

On Sunday morning, the subject of “failing” schools and “failing” teachers and “failing” students arrived on my doorstep in Patrick O’Donnell’s Plain Dealer article about what key Ohio legislators believe is dangerous: that too many students graduated from high school this year because of “soft” alternative pathways to graduation.  These alternative pathways were only for the 2018 school year—because educators successfully lobbied that the new graduation tests were so hard that all sorts of young people would be denied graduation.  O’Donnell tells us the educators’ fears were well grounded: “More than a third of this spring’s high school graduates from some urban areas would never have received their diplomas under Ohio’s new graduation requirements, were it not for some temporary and easier ‘pathways’ added to avert a statewide graduation ‘crisis.’ In Akron and Columbus, new test-based requirements would have prevented more than a third of this year’s graduates from marching at ceremonies in caps and gowns. In Cleveland, the impact of the controversial new standards would have been even stronger. The higher expectations would have wiped out diplomas for nearly half of the seniors who received them. Those students instead graduated using special one-time alternate pathways created just for this year to ease the transition to the new standards.”

This is the “failing” schools narrative at work.  If you can find a way to read this without noticing legislators’ indictment of those “failing” schools in Akron and Columbus and especially in Cleveland, Rep. Andy Brenner, Chair of the Ohio House Education Committee, will correct you: “What’s going on that they’re not able to get kids up to being college and career-ready?”

Contrast the understanding of education by outcomes-based education accountability hawks like Andy Brenner with the understanding of learning depicted in the new documentary film about Fred Rogers, Won’t You Be My Neighbor?  Mr. Rogers—influenced by prominent experts in child development like T. Berry Brazelton and Margaret McFarland—defined education as relating to children, listening to children, and responding to children’s questions and needs and concerns.  For Mr. Rogers, education was not teacher- or school-driven but instead happened in relationship—building a child’s understanding from the foundation within the child. A teacher guides instead of lecturing; a teacher responds instead of driving material into a child’s brain.  A teacher starts where the child is.

Contrast such a developmental understanding of teaching and learning with the model framed by an outcomes-driven reformer intent on pouring in enough testable material to get enough adolescents to pass the tests and produce a career-ready cohort from each high school. The outcomes-based reformer worries about the so-called quality of the diploma; the educator in Mr. Rogers’ mold considers beginning where the child is and helping that child realize her or his promise.

In this year’s very best book on education, Harvard’s Daniel Koretz describes the flaws in outcomes-based school accountability. The title explains the book’s importance for our times: The Testing Charade: Pretending to Make Schools Better.

Koretz is a psychometrician.  While he is neither a child psychologist nor a specialist in child development, Koretz describes the omission of all sorts of essential parts of education, including the kind of teaching Fred Rogers believed was important: “A… critical failure of the reforms is that they left almost no room for human judgment. Teachers are not trusted to evaluate students or each other, principals are not trusted to evaluate teachers, and the judgment of professionals from outside the school has only a limited role. What the reformers trust is ‘objective’ standardized measures…. (T)he focus of reform in the United States has been to rely as much as possible on standardized measures and to minimize human judgment, even though the result was to leave a great deal of what is most important unmeasured—and therefore to give educators no incentive to focus on it.  This is one of the most fundamental flaws of test-based accountability and one of the most significant reasons for its failures.” (The Testing Charade, pp. 34-35)

Koretz explains how outcomes-based education is undermining our very understanding of education—and undermining teaching: “Not only is bad test prep pervasive. It has begun to undermine the very notion of good instruction… One of the rationales given to new teachers for focusing on score gains is that high-stakes tests serve a gatekeeping function, and therefore training kids to do well on tests opens doors for them… Whether raising scores will improve students’ later success… depends on how one raises scores.  Increasing scores by teaching well can increase students’ later success… In the early days of test-based accountability, some observers worried that educators were coming to confuse the test with the curriculum… Some of today’s teacher educators, however, make a virtue of this mistake. They often tell new teachers that tests, rather than standards or a curriculum should define what they teach… Why does this matter so much? To start, it encourages reallocation—that is, focusing instruction on the tested sample rather than the domain or the curriculum that it is supposed to represent… What we want is for students to gain the ability to apply knowledge and skills to problems they actually encounter—not to ensure their proficiency in applying them only to test items that look exactly like the ones they will confront in the main test at the end of the year.”  (The Testing Charade, pp. 112-116)

Finally, Koretz speaks directly to the problem in Ohio, where alternative pathways to high school graduation have been needed to ensure high school graduation for large percentages of students in the state’s poorest cities but where students in affluent suburbs with schools to which the state awards “A+” grades merely sail through the new graduation requirements. Outcomes-based education accountability hawks set benchmarks more easily reached by the privileged, but we blame the schools and teachers in poorer communities—and with high school graduation benchmarks, we penalize the students themselves.

Koretz explains: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores…. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms…. Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (The Testing Charade, pp. 129-130)

Sometimes I think I ought to carry a copy of Koretz’s book in my purse, though I’d be written off as such a bore if I were to pull it out and read from it when somebody at a party begins bragging about their school—rated “A+” by the state of Ohio—while the school across town gets an “F.”  Everybody ought to take Daniel Koretz’s book to read at the beach this summer.

Sorting Out the Debate About Educational Accountability

The watchword for the last quarter century’s school reform has been accountability: holding schools and school teachers accountable for quickly raising students’ scores on standardized tests. Sanctioning schools and teachers who can’t quickly raise scores was supposed to be an effective strategy for overcoming educational injustice. Test-and-punish has enabled us at least to say we’ve been doing something to hold schools accountable.

The politics of this conversation are pretty confusing—all going back to the federal education law, the 2001 No Child Left Behind Act (NCLB), and the debate about its replacement, the 2015 Every Student Succeeds Act (ESSA).  There was bipartisan agreement in 2001-2002 when NCLB was debated, passed, and signed into law that our society could close racial and economic achievement gaps by testing all students and then demanding that schools quickly raise the scores of underachieving students. In 2015 when Congress debated the law’s reauthorization, accountability-hawk Democrats stood by test-and-punish accountability; many Republicans, led by Senator Lamar Alexander, instead pushed to expand states’ rights by lifting the heavy hand of the federal government and allowing states to design their own plans to improve so-called failing schools. Worrying that removal of universal testing would let schools off the hook, the Civil Rights Community has stood by NCLB’s testing plan. Many have continued to assume that universal testing exposes achievement gaps and that the exposure will motivate politicians and educators to address racial and economic disparities.

Test-and-punish school reform has been at the center of a conversation between Republican Senator Lamar Alexander, the chair of the Senate Health, Education, Labor and Pensions Committee, and Republican Education Secretary Betsy DeVos.  An article by Caitlin Emma published over the weekend by POLITICO examines the history of No Child Left Behind vs. the Every Student Succeeds Act as a background for looking at how policy around school accountability has been evolving in the Trump administration. Emma describes the new ESSA, passed by a Republican Congress in 2015 and designed to return at least some authority for accountability back to the states. But Democrats prodded by Civil Rights leaders and some Republicans have stood by federally imposed accountability: “Critics… worry whether states will adequately track and provide equal opportunities for at-risk kids…. (Even) former Republican Rep. John Kline… an architect of the measure, has said he’s worried states are now getting away with testing plans that violate a key requirement of the law—that states administer the same test to all students annually.  The provision is critical (Kline believes) so that states are forced to report the performance of all students and the results for poor and minority students are not hidden from view, as they were for decades before federal testing requirements were enacted.”

Emma explains: “The Every Student Succeeds Act, which passed in 2015, was widely viewed by Republicans as a corrective to the federal overreach that followed… No Child Left Behind.”  Emma reports that last summer, when Jason Botel, an official in Betsy DeVos’s Department of Education, began reviewing the states’ applications for federal funds under the ESSA, Botel demanded that before he would approve some states’ plans, they must toughen their standards and demand more.  Powerful Republican Senator Lamar Alexander, who had—during the 2015 reauthorization—supported a return of control to the states, formally complained to Betsy DeVos—“furious that a top DeVos aide was circumventing a new law aimed at reducing the federal government’s role in K-12 education. He contended that the agency was out of bounds by challenging state officials, for instance, about whether they were setting sufficiently ambitious goals for their students.”

For many of us who have, for fifteen years, closely followed educational accountability as mandated under No Child Left Behind and the Every Student Succeeds Act, the entire debate seems wrong-headed and bizarre.  I am writing about those of us who care deeply about expanding opportunity for children segregated in schools where poverty is highly concentrated—schools where intense segregation by poverty is overlaid on segregation by ethnicity and race. The schools these children attend have, under federal policy, been derided by accountability hawks as “failing” schools.  Widespread blaming—of schools and school teachers—now dominates discussions of school reform even as sociologists increasingly document that family and neighborhood poverty pose overwhelming challenges for these children and their schools.

Much of the confusion and rancor arises because the public debate about school accountability conflates two very different questions:

  • Should the federal government be involved at all in telling states what to do about education?
  • Is test-and-punish accountability an effective strategy for improving public schools and closing opportunity gaps?

The original federal education law, the 1965 Elementary and Secondary Education Act, addressed the first question as a response to the needs of children in primarily southern states, where schools serving black children had been underfunded and inadequate for generations. There are similar problems of inequity today across cities and forgotten rural areas. Poor children and children of color segregated in particular areas remain underserved. The debate about this first question involves states’ rights vs. what has come to be accepted (by many of us) as the federal government’s responsibility to protect the rights of all children and ensure they are all well served. It is a heated question that remains underneath much of the debate about school reform.

The second question involves the strategy Congress chose for reforming schools in the 2001 No Child Left Behind Act. Congress blamed teachers and schools and devised a law that was supposed to force schools and teachers to work harder and faster to improve test scores in schools where achievement lagged when all children in each state were tested on a single standardized test.  It is becoming clearer all the time that when Congress jumped behind test-and-punish accountability, it chose the wrong strategy.  A long and growing body of research demonstrates that test scores are far more aligned with a school’s aggregate economic level than with the work of the teachers or the curriculum being offered to students. Economists like Bruce Baker at Rutgers University also document enormous opportunity gaps as these same public schools in our nation’s poorest communities receive far less public investment than the schools in wealthy suburbs, schools serving children whose families also invest heavily in enrichments at home.

Here is just some of the prominent research from the past ten years that tries to answer the second question.

In 2010, Anthony Bryk and educational sociologists from the Consortium on Chicago School Research at the University of Chicago described the challenges for a particular subset of schools in Chicago, Illinois that exist in a city where many schools serve low income children. The Consortium focused on 46 schools whose students live in neighborhoods where poverty is extremely concentrated.  These “truly disadvantaged” schools are far poorer than the norm. They serve families and neighborhoods where the median family income is $9,480. They are racially segregated, each serving 99 percent African American children, and they serve on average 96 percent poor children, with virtually no middle class children present. The researchers report that in the truly disadvantaged schools, 25 percent of the children have been substantiated by the Department of Children and Family Services as being abused or neglected, either currently or during some earlier point in their elementary career. “This means that in a typical classroom of 30… a teacher might be expected to engage 7 or 8 such students every year.”  “(T)he job of school improvement appears especially demanding in truly disadvantaged urban communities where collective efficacy and church participation may be relatively low, residents have few social contacts outside their neighborhood, and crime rates are high.  It can be equally demanding in schools with relatively high proportions of students living under exceptional circumstances, where the collective human need can easily overwhelm even the strongest of spirits and the best of intentions. Under these extreme conditions, sustaining the necessary efforts to push a school forward on a positive trajectory of change may prove daunting indeed.” (Organizing Schools for Improvement, pp. 172-187)

Then in 2011, Sean Reardon of Stanford University released a massive data analysis confirming the connection of school achievement gaps to growing economic inequality and residential patterns becoming rapidly more segregated by income. Reardon documented that across America’s metropolitan areas the proportion of families living in either very poor or very affluent neighborhoods increased from 15 percent in 1970 to 33 percent by 2009, and the proportion of families living in middle income neighborhoods declined from 65 percent in 1970 to 42 percent in 2009.  Reardon also demonstrated that along with growing residential inequality is a simultaneous jump in an income-inequality school achievement gap among children and adolescents.  The achievement gap between students with income in the top ten percent and students with income in the bottom ten percent is 30-40 percent wider among children born in 2001 than those born in 1975.

In The Testing Charade, a book published just last month, Daniel Koretz of Harvard University blames test-and-punish accountability for enabling our society to pretend that we have been overcoming educational inequity at the same time we avoid making the public investment necessary even to begin addressing the problem: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores…. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms…. Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)  “If we are going to make real headway, we are going to have to confront the simple fact that many teachers will need substantial supports if they are going to markedly improve the performance of their students… And the range of services needed is broad. One can’t expect students’ performance in schools to be unaffected by inadequate nutrition, insufficient health care, home environments that have prepared them poorly for school, or violence on the way to school.” (p. 201)

The second question involves the overall direction of education policy, and it is important because we desperately need a better strategy. Blaming and punishing the schools with the lowest scores—by closing “failing” schools or privatizing them or firing their teachers and principals—has only further undermined the public schools in the poorest neighborhoods of our big cities without addressing the opportunity gaps the tests identify.

Today’s Republican tax slashing agenda will only further reduce public investment in education.  And we are likely to keep on blaming the victims.

Daniel Koretz: More Detail from “The Testing Charade” on Cheating Scandal in Atlanta

Back in 2015, I watched when part of the trial of the Atlanta school teachers—accused of erasing and correcting their students’ test scores—was televised on C-SPAN. And two weeks ago I read Daniel Koretz’s new book, The Testing Charade, a book about what happens when high stakes punishments are attached to any social indicator. I read Koretz’s book pretty much without emotion or judgment—as an academic exercise to understand his argument against the high stakes that policy makers have used as a threat to drive teachers to work harder and raise test scores faster. I didn’t focus on the sections about the cheating scandals.  After all, I imagined, the scandals have just become a part of history.

Then on Wednesday evening, I watched Lisa Stark’s report for the PBS NewsHour about the 9 Atlanta school teachers and principals who are appealing their criminal convictions to clear their names and avoid stints in prison for participating in what is said to have been a 44-school cheating scandal driven by Superintendent Beverly Hall, who won awards when test scores rose miraculously quickly in Atlanta’s schools. Hall died before her own involvement could be adjudicated.

Daniel Koretz, the Harvard professor whose new book explores the Atlanta cheating scandal (among cheating scandals in Washington, D.C., Pennsylvania and many other places) as among the widespread consequences of our test-and-punish regime of school reform, spoke briefly in Lisa Stark’s report. In his book he attributes the problem to what social scientists call Campbell’s Law. Here is Koretz’s definition: “The more any quantitative social indicator is used for social decision making the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (p. 38)

Koretz explores the issue far more deeply in his new book than he did in Wednesday night’s short clip for the NewsHour. Two years ago I felt that the Atlanta educators’ criminal convictions were unfair. As I watched the PBS report, I recognized the relief I had felt two weeks ago when I read Koretz’s book and found that an expert scholar confirmed my own sense of injustice in Atlanta. That recognition sent me back again yesterday to the book.  Here is some of what he didn’t have time to say in Wednesday’s report for PBS.

“One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130) Koretz continues: “(T)his decision backfired. The result was, in many cases, unrealistic expectations that teachers simply couldn’t meet by any legitimate means.” (p. 134)

In Atlanta, Koretz describes the situation at Parks Middle School, as it was portrayed by Rachel Aviv in a New Yorker profile of the Atlanta cheating scandal.  Koretz explains: “This is the school where Damany Lewis and Christopher Waller worked. Aviv documented the way in which Waller choreographed an increasingly large and well-organized cheating ring… Why did Lewis and others do this? At least in Lewis’s case, it was not because he was comfortable cheating. Quite the contrary…  Then why? In a nutshell, because their only other choice was to fail—not when compared with reasonable goals but when held to Hall’s and NCLB’s entirely arbitrary targets. Parks is located in a terribly depressed neighborhood. Half the homes are vacant. Students call the neighborhood ‘Jack City’ because of all the armed robberies. Very few of the students come from homes with two parents. Aviv reported that some students came to school in filthy clothing and that Lewis told students to drop dirty laundry in the back of his truck so that he could wash clothes for them. Some of the parents were dysfunctional because of drug use. During the years leading up to the cheating scandal, Parks had made real progress. A new principal renovated the school and worked on both refocusing students on academics and building a sense of community. Using funds that Hall’s administration had obtained, the school implemented after-school and tutoring programs. However, this simply wasn’t enough, given how fast scores had to rise to meet Hall’s demands. Lewis told Aviv that he had pushed his students harder than they had ever been pushed and that he was ‘not willing to let the state slap them in the face and say they’re failures.’” (pp. 77-78)

Besides leaving 9 Atlanta teachers and principals with criminal convictions, what has been the ultimate outcome of all this test-and-punish for society as a whole including our children? “It’s no exaggeration to say that the costs of test-based accountability have been huge. Instruction has been corrupted on a broad scale. Large amounts of instructional time are now siphoned off into test-prep activities that at best waste time and at worst defraud students and their parents. Cheating has become widespread. The public has been deceived into thinking that achievement has dramatically improved and that achievement gaps have narrowed. Many students are subjected to severe stress… Educators have been evaluated in misleading and in some cases utterly absurd ways. Careers have been disrupted and in some cases ended. Educators including prominent administrators, have been indicted and even imprisoned. The primary benefit we received in return for all of this was substantial gains in elementary-school math that don’t persist until graduation.” (p. 191)

Koretz concludes: “Reformers may take umbrage and say that they certainly didn’t demand that teachers cheat. They didn’t, although in fact many policy makers actively encouraged bad test prep that produced fraudulent gains. What they did demand was unrelenting and often very large gains that many teachers couldn’t produce through better instruction, and they left them with inadequate supports as they struggled to meet these often unrealistic targets. They gave many educators the choice… fail, cut corners, or cheat—and many chose not to fail.” (p. 244)