In this Teacher Appreciation Week, Fair Pay Would Show Our Teachers They Really Are Appreciated

In 1962, when my mother taught first grade in Havre, Montana, she felt appreciated as a teacher even though the rule was that she had to take the kids outside for recess unless it was below 15 degrees below zero. (Remember that wind chill as a term hadn’t been invented in those days.) She wasn’t paid particularly well, but school did close for an hour at midday, while everybody went home for lunch. She saw her students’ parents all the time in the grocery store, however, and she knew that her opinions and her expertise were valued.

This week has been formally designated as the 2019 Teacher Appreciation Week. But teachers these days aren’t really appreciated. While the Washington Post reports that, merely to sit on Boeing’s board of directors, Caroline Kennedy and Nikki Haley are paid $324,000 annually in cash and stock to attend a day-long meeting every other month, school teachers’ salaries haven’t been keeping up at all.

The Economic Policy Institute’s Sylvia Allegretto and Lawrence Mishel just released a report about persistent growth in a teacher wage penalty, which reached an all-time high in 2018: “(R)elative teacher wages, as well as total compensation—compared with the wages and total compensation of other college graduates—have been eroding for over half a century.  These trends influence the career choices of college students, biasing them against the teaching profession, and also make it difficult to keep current teachers in the classroom.”

Allegretto and Mishel explain the trend: “(W)omen teachers enjoyed a wage premium in 1960, meaning they were paid more than comparably educated and experienced women workers in other fields. By the early 1980s, the wage premium for women teachers had transformed into a wage penalty… The mid-1990s marks the start of a period of sharply eroding teacher weekly wages and an escalating teacher weekly wage penalty.  Average weekly wages of public school teachers (adjusted for inflation) decreased $21 from 1996-2018, from $1,216 to $1,195 (in 2018 dollars).  In contrast, weekly wages of college graduates rose by $323, from $1,454 to $1,777, over this period.”
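For readers who want to check the arithmetic in that passage, here is a minimal sketch using only the four weekly wage figures quoted above. The function and variable names are my own, and the raw ratio it computes is not the report's wage penalty, which compares teachers with comparable workers rather than with all college graduates.

```python
# Weekly wages quoted from Allegretto and Mishel (in 2018 dollars).
TEACHER_1996, TEACHER_2018 = 1216, 1195
GRAD_1996, GRAD_2018 = 1454, 1777


def change(start, end):
    """Return (dollar change, percent change) between two weekly wages."""
    diff = end - start
    return diff, 100 * diff / start


teacher_diff, teacher_pct = change(TEACHER_1996, TEACHER_2018)
grad_diff, grad_pct = change(GRAD_1996, GRAD_2018)

# Unadjusted ratio of teacher wages to college-graduate wages in each year.
# This is an illustration only, not the report's adjusted penalty.
ratio_1996 = TEACHER_1996 / GRAD_1996
ratio_2018 = TEACHER_2018 / GRAD_2018

print(f"Teachers: {teacher_diff:+d} dollars ({teacher_pct:+.1f}%)")
print(f"College graduates: {grad_diff:+d} dollars ({grad_pct:+.1f}%)")
print(f"Teacher-to-graduate ratio: {ratio_1996:.2f} in 1996, {ratio_2018:.2f} in 2018")
```

The dollar changes match the quoted $21 decline and $323 increase, and the falling ratio shows the same widening gap the report describes.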

And the wage penalty is for both women and men: “The wage premium that women teachers enjoyed in the 1960s and 1970s has long been erased…. Our previous research found that in 1960 women teachers earned 14.7 percent more in weekly wages than comparable women workers… And the wage premium for women teachers gradually faded over the 1980s and 1990s, until it was eventually replaced by a large and growing wage penalty in the 2000s and 2010s.  In 2018, women public school teachers were making 15.1 percent less in wages than comparable women workers.  The wage penalty for men teachers is much larger. The weekly wage penalty for men teachers was 17.8 percent in 1979… In 2018, men teaching public school were making 31.5 percent less in wages than comparable men in other professions.” Overall in 2018, the wage penalty for school teachers reached 21.4 percent.

Teachers’ benefits, on average, are higher than those of workers in other professions.  Allegretto and Mishel explain: “As a result of their growing benefit share of compensation, teachers are enjoying a ‘benefits advantage’ over other professionals… However this benefits advantage has not been enough to offset the growing wage penalty… The bottom line is that the teacher (total) compensation penalty grew by 10.2 percentage points from 1993-2018.”

There is not a lot of mystery behind how the teacher pay gap has grown.  Allegretto and Mishel blame a wave of tax cuts across the states for the revenue shortages that have driven down compensation for teachers: “The erosion of teacher weekly wages relative to weekly wages of other college graduates… reflects state policy decisions rather than the result of revenue challenges brought on by the Great Recession. A recent study… found that most of the 25 states that were still spending less for K-12 education in 2016 than before the recession had also enacted tax cuts between 2008 and 2016.  In fact, eight of the 10 states with the largest reductions in education funding since 2008 were states that had reduced their overall ‘tax effort’—meaning through tax cuts or other measures they were collecting less in taxes relative to their capacity to generate tax revenue. These eight states were Alabama, Arizona, Florida, Georgia, Idaho, Kansas, Oklahoma, and Virginia.”

Lots of experts including the Economic Policy Institute and the Learning Policy Institute have been tracking the result of extremely difficult teaching conditions in understaffed schools along with low pay for teachers. They have identified what they call the resulting widespread teacher shortage, particularly a shortage of well prepared and experienced teachers.  And they have emphasized the tragedy of increasing churn in the teaching profession as more and more teachers give up and leave the classroom.

But the teacher-blogger Peter Greene insists we call what is happening something different: “There is no teacher shortage. There’s a slow-motion walkout, a one-by-one exodus, a piecemeal rejection of the terms of employment for educators in 2019… ‘We’ve got a teacher shortage,’ assumes… that there just aren’t enough teachers out there in the world…. That’s where teacher shortage talk takes us—to a search for teacher substitutes. Maybe we can just lower the bar. Only require a college degree in anything at all…  A few hundred students with a ‘mentor’ and a computer would be just as good as one of those teachers that we’re short of, anyway, right?”

Greene defines the problem another way: “Teaching has become such unattractive work that few people want to do it.”  And having defined the problem, he believes there are some ways to address it: “‘Offer them more money’… is certainly an Economics 101 answer… But as the #RedForEd walkouts remind us, money isn’t the whole issue.  Respect. Support.  The tools necessary to do a great job.  Autonomy.  Treating people like actual functioning adults.  These are all things that would make teaching jobs far more appealing… There are other factors that make the job less attractive. The incessant focus on testing. The constant stream of new policies crafted by people who couldn’t do a teacher’s job for fifteen minutes. The huge workload, including a constant mountainous river of… paperwork…. the moves to deprofessionalize the work.  The national scale drumbeat of criticism and complaint….”

I believe the collapse in respect for teachers has also been driven by the strategy of the No Child Left Behind Act, which neglected to fund adequate staffing and school improvement and set out to motivate educators with the fear their school would be named “failing” if they could not raise test scores quickly for all children. They were supposed to work harder and smarter. We now know that No Child Left Behind’s demand that all schools make their students proficient by 2014 didn’t work. Arne Duncan had to grant states waivers from this requirement to avoid an embarrassing reality: All American schools were going to be branded “failing.”  But today our national education strategy is still driven by the same test-and-punish approach.

Harvard University’s Daniel Koretz warns us about the dangers of our test-based accountability regime in a 2017 book, The Testing Charade: Pretending to Make Schools Better. Koretz is an expert on the design and use of standardized testing as the basis for evaluating schools and schoolteachers. He demonstrates how this strategy unfairly brands teachers as failures when they teach in the schools serving our society’s poorest and most vulnerable children: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

We have been watching a yearlong wave of walkouts by teachers, a state-by-state cry for help from a profession of hard-working, dedicated public servants disgusted with despicable working conditions, the lack of desperately needed services for their students, and insultingly low pay. They have shown us what would support them and their students: smaller classes, more counselors and social workers, school nurses, librarians, a generous and enriched curriculum, and salaries adequate to pay the rent for a modest apartment, attract new teachers to the profession, and keep experienced teachers.

In this 2019 Teacher Appreciation Week it is a tragedy that so many state legislatures continue to debate further tax cuts. The situation calls to mind the warning of McMaster University education professor Henry Giroux: “Public schools are at the center of the manufactured breakdown of the fabric of everyday life. They are under attack not because they are failing, but because they are public…”

Politicians Forget that Cut Scores on Standardized Tests Are Not Grounded in Science

Last week the NY Times’ Dana Goldstein and Manny Fernandez reported on a political fight in Texas over the scoring of the STAAR, the State of Texas Assessments of Academic Readiness, which is the state’s version of the achievement test each state must still administer every year in grades 3-8 and once in high school.  The federal Every Student Succeeds Act, passed in 2015 to replace No Child Left Behind, still mandates annual testing, although Congress no longer imposes its own high stakes punishments for failure.

However, Congress still does require the states to submit plans to the U.S. Department of Education declaring what will be the consequences for low-scoring schools.  Goldstein and Fernandez explain that Texas, like many other states, still imposes punishments for the low scorers instead of offering help: “The test, the State of Texas Assessments of Academic Readiness, or STAAR, can have profound consequences not just for students but for schools across the state, hundreds of which have been deemed inadequate and are subject to interventions that critics say are undue.”  Schools have to provide help for students who are not on grade level. Also: “Texas grades its districts on an A through F scale, in part based on how many students are meeting or exceeding grade-level standards… Persistently failing schools, and districts with just a single such school, can be shut down or taken over by the state—a threat facing the state’s largest school system, in Houston.”

Decades of research show that, in the aggregate, standardized test scores correlate with family and neighborhood income. In a country where segregation by race and poverty continues to grow, it is now recognized among experts and researchers that rating and ranking schools and districts by their aggregate test scores merely brands the poorest schools as failing. When sanctions are attached, political regimes of test-based accountability merely punish the schools and the teachers and the students in the poorest places.

In an excellent 2017 book, The Testing Charade: Pretending to Make Schools Better, Harvard professor Daniel Koretz explains the correlation of aggregate standardized test scores with family and community economics: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)
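Koretz’s point that the targets were “an automatic consequence” of where the standard sits and how long schools are given can be made concrete with a little arithmetic. The sketch below is my own illustration, not anything from the book: the two schools, their starting percentages, the twelve-year window, and the straight-line path are all hypothetical assumptions.

```python
def required_annual_gain(baseline_pct, years, target_pct=100.0):
    """Percentage points per year needed to move from baseline to target.

    Assumes a straight-line path, which is a simplification made here
    for illustration only.
    """
    return (target_pct - baseline_pct) / years


# Hypothetical schools: share of students already 'proficient' at the start.
schools = {"higher-scoring school": 80.0, "lower-scoring school": 30.0}
YEARS = 12  # hypothetical length of the deadline window

gains = {name: required_annual_gain(pct, YEARS) for name, pct in schools.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f} percentage points per year")
```

Under these made-up numbers the lower-scoring school must improve more than three times as fast as the higher-scoring one, simply because of where it started and when the deadline falls.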

Goldstein and Fernandez report that the political fight in Texas this month is about the test scores in third grade reading: “The 2018 STAAR tests found that 58 percent of Texas third graders are not reading at grade level. On the 2017 National Assessment of Educational Progress, given to a sample of fourth graders across the country, 72 percent of Texas students were not proficient in reading—a fact the state has cited as evidence that tough local standards are warranted.”

Like many other states, Texas blames the public schools.  But Goldstein and Fernandez present other factors that ought to be considered here: “More than half of the state’s public school students are Hispanic and nearly 60 percent come from low-income families.  About a fifth are still learning English.”  The state argues that’s all the more reason to set the passing cut score high and motivate schools to catch kids up quicker.

But educators and parents and some politicians in Texas are pushing back. They contend that the bar is set so high that students who are reading at grade level still score below the cut score for proficiency.  There is a lot of discussion of reading passages said to be two grade levels ahead of the students being tested and of something called Lexile measures, which involve the number of syllables in a word and are used to evaluate the difficulty of the passages on the test.

It would clear up a lot of the trouble if more people read Chapter 8, “Making Up Unrealistic Targets,” in Daniel Koretz’s book. Koretz explains that there is nothing really scientific about where “proficient” cut scores are set: “If one doesn’t look too closely, reporting what percentage of students are ‘proficient’ seems clear enough. Someone somehow determined what level of achievement we should expect at any given grade—that’s what we will call ‘proficient’—and we’re just counting how many kids have reached that point. This seeming simplicity and clarity is why almost all public discussion of test scores is now cast in terms of the percentage reaching either the proficient standard, or occasionally, another cut score… The trust most people have in performance standards is essential, because the entire educational system now revolves around them. The percentage of kids who reach the standard is the key number determining which teachers and schools will be rewarded or punished.” (The Testing Charade, p. 120)

Koretz explains that standardized test cut scores are not set scientifically. There is no scientific or even magical way of deciding exactly which reading passages every third grader must be able to decode and comprehend, and anyway, students in third grade are not consistent.  Koretz examines several methods used by panels of judges to set the “proficient” level.  He adds that the methods used by different state panels don’t arrive at the same cut scores: “The percentage of kids deemed to be ‘proficient’ sometimes varies dramatically from one method to another.” (The Testing Charade, p. 124)

Goldstein and Fernandez indicate that Texas uses the National Assessment of Educational Progress (NAEP) as the audit test by which it judges how accurately it sets its own levels of proficiency. When the scores on the STAAR are compared to the scores on the NAEP, politicians in Texas are really concerned because NAEP shows that 72 percent of fourth graders in Texas are not proficient, even worse than the 58 percent of third graders who score below proficient on the STAAR.

But the matter is not as dire as it would appear. The education historian Diane Ravitch served on the National Assessment Governing Board for seven years.  Ravitch explains that the cut scores on the NAEP are set artificially high.  It is much harder to reach the proficient level than what our common understanding of the term “proficient” would lead us to expect: “‘Proficient’ on NAEP does not indicate ‘average’ performance; it is set very high… There are four levels. At the top is ‘advanced.’ Then comes ‘proficient.’ Then ‘basic.’ And last, ‘below basic.’  Advanced is truly superb performance, which is like getting an A+. Among fourth graders, 8% were advanced readers in 2011; 3% of eighth graders were advanced. In reading, these numbers have changed little in the past twenty years…   Proficient is akin to a solid A. In reading, the proportion who were proficient in fourth grade reading rose from 29% in 1992 to 34% in 2011. The proportion proficient in eighth grade also rose from 29% to 34% in those years… Basic is akin to a B or C level performance. Good but not good enough.”

The argument about what different “proficient” levels really mean is old and tired, but we can’t seem to move beyond it. Today we know that the No Child Left Behind Act was aspirational. It was supposed to motivate teachers to work harder to raise scores. Policymakers hoped that if they set the bar really high, teachers would figure out how to get kids over it. It didn’t work.  No Child Left Behind said that all children in American public schools would be proficient by 2014 or their school would be labeled failing. Finally, as 2014 loomed, Arne Duncan had to give states waivers to avoid what would have happened if the law had been enforced: All American public schools would have been declared “failing.”

As we continue to haggle about the cut scores by which we judge our children and their schools, however, there is one thing we almost never consider.  What if—instead of punishing the schools where scores are lower and instead of making their children drill harder and attend Saturday cram sessions—we were willing to invest more tax dollars in the lowest scoring schools?  What if we made classes smaller to make it possible for teachers to work more personally with each student?  What if we made sure that the schools in our poorest communities had well stocked libraries with certified librarians and story-hours once or even twice a week?

Koretz comes to this same conclusion, although he explains it more theoretically: “(I)t is clear that the implicit assumption undergirding the reforms is that we can dramatically reduce the variability of achievement… Unfortunately, all evidence indicates that this optimism is unfounded.  We can undoubtedly reduce variations in performance appreciably if we summoned the political will and committed the resources to do so—which would require a lot more than simply imposing requirements that educators reach arbitrary targets for test scores.” (The Testing Charade, p. 131)

U.S. Public Education Is Driven by High-Stakes Testing. Are the Proficiency Cut-Scores Legitimate?

Back in 2005, I worked with members of the National Council of Churches Committee on Public Education and Literacy to develop a short resource, Ten Moral Concerns in the No Child Left Behind Act. While closing achievement gaps seemed an important goal, to us it seemed wrong that—according to an unrelenting year-by-year Adequate Yearly Progress schedule—the law blindly held teachers and schools accountable for raising all children’s test performance to the test score targets set by every state. Children come to school with such a wide range of preparation, and achievement gaps are present when children arrive in Kindergarten.  At that time, we expressed our concern this way:

“Till now the No Child Left Behind Act has neither acknowledged where children start the school year nor celebrated their individual accomplishments. A school where the mean eighth grade math score for any one subgroup grows from a third to a sixth grade level has been labeled “in need of improvement” (a label of failure) even though the students have made significant progress. The law has not acknowledged that every child is unique and that Adequate Yearly Progress (AYP) thresholds are merely benchmarks set by human beings. Although the Department of Education now permits states to measure student growth, because the technology for tracking individual learning over time is far more complicated than the law’s authors anticipated, too many children will continue to be labeled failures even though they are making strides, and their schools will continue to be labeled failures unless all sub-groups of children are on track to reach reading and math proficiency by 2014.”

Of course today we know that the No Child Left Behind Act was supposed to motivate teachers to work harder to raise scores. Policymakers hoped that if they set the bar really high, teachers would figure out how to get kids over it.  It didn’t work.  No Child Left Behind said that all children would be proficient by 2014 or their school would be labeled failing. Finally, as 2014 loomed, Arne Duncan had to give states waivers to avoid what would have happened if the law had been enforced: All American public schools would have been declared “failing.”

Despite the failure of No Child Left Behind, members of the public, the press, and the politicians across the 50 statehouses who implemented the testing requirements of No Child Left Behind continue to accept the validity of high stakes testing. Politicians, the newspaper reporters and editors who report the scores, and the general public trust the supposed experts who set the cut scores.  That is why states still rank and rate public schools by their test scores and legislators pass laws to punish low-scoring schools and teachers. That is why on Wednesday this blog commented on Ohio’s plan to expand EdChoice vouchers for students in low-scoring schools and add charters in low-scoring school districts. The list of “failing” schools where students will qualify for vouchers will rise next school year in Ohio from 218 to 475. The list of charter school-eligible districts will grow from 38 to 217.

In response to the continuation of test-and-punish, I’ve been quoting Daniel Koretz’s book, The Testing Charade, on the fact that testing cut scores are arbitrary and the punishments unfair:  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do…  Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

As a blogger, I am not an expert on how test score targets—the cut scores—are set, but Daniel Koretz devotes an entire chapter of his book, “Making Up Unrealistic Targets,” to this subject.  Here is how he begins:  “If one doesn’t look too closely, reporting what percentage of students are ‘proficient’ seems clear enough. Someone somehow determined what level of achievement we should expect at any given grade—that’s what we will call ‘proficient’—and we’re just counting how many kids have reached that point. This seeming simplicity and clarity is why almost all public discussion of test scores is now cast in terms of the percentage reaching either the proficient standard, or occasionally, another cut score… The trust most people have in performance standards is essential, because the entire educational system now revolves around them. The percentage of kids who reach the standard is the key number determining which teachers and schools will be rewarded or punished.”  (The Testing Charade, p. 120)

After emphasizing that benchmark scores are not scientifically set and are in fact all arbitrary, Koretz examines some of the methods. The “bookmark” method, he explains, “hinges entirely on people’s guesses about how imaginary students would perform on individual test items… (P)anels of judges are given a written definition of what a standard like “proficient” is supposed to mean.”  Koretz quotes from Nebraska’s definition of reading comprehension: “A student scoring at the Meets the Standards level generally utilizes a variety of reading skills and strategies to comprehend and interpret narrative and informational text at grade level.” After enumerating some of the specific skills and strategies listed in Nebraska, Koretz adds a qualification to the way Nebraska describes its methodology: “A short digression: the emphasized word generally is very important. One of the problems in setting standards is that students are inconsistent in their performance.” (The Testing Charade, pp. 121-122) (Emphasis in the original.)

Koretz continues: “There is another, perhaps even more important, reason why performance standards can’t be trusted: there are many different methods one can use, and there is rarely a really persuasive reason to select one over the other. For example, another common approach, the Angoff method… is like the bookmark in requiring panelists to imagine marginally proficient students, but in this approach they are not given the order of difficulty of the items or a response probability. Instead panelists have to guess the percentage of imaginary marginally proficient students who would correctly answer every item in the test. Other methods entail examining and rating actual student work, rather than guessing the performance of imaginary students on individual items.  Yet other methods hinge on predictions of later performance—for example, in college. There are yet others. This wouldn’t matter if these different methods gave you at least roughly similar results, but they often don’t.  The percentage of kids deemed to be ‘proficient’ sometimes varies dramatically from one method to another.  This inconsistency was copiously documented almost thirty years ago, and the news hasn’t gotten any better.” (The Testing Charade, pp. 123-124)
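To see how much depends on the panelists’ guesses, here is a rough sketch of an Angoff-style calculation. I am not a psychometrician, so treat it as an illustration under stated assumptions: the cut score is taken as the sum, across test items, of the panel’s average guess for each item, and the five-item test, the two panels, their ratings, and the student scores are all invented for the example.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Compute an Angoff-style cut score from panelist ratings.

    ratings[p][i] is panelist p's guess of the share (0 to 1) of
    marginally proficient students who would answer item i correctly.
    The cut score is the sum over items of the panel's mean guess.
    """
    n_items = len(ratings[0])
    return sum(mean(panelist[i] for panelist in ratings) for i in range(n_items))


def percent_proficient(scores, cut):
    """Percent of students scoring at or above the cut score."""
    return 100 * sum(s >= cut for s in scores) / len(scores)


# Hypothetical data: a 5-item test rated by two panels of three judges.
panel_a = [[0.6, 0.5, 0.7, 0.4, 0.6],
           [0.7, 0.5, 0.6, 0.5, 0.6],
           [0.5, 0.5, 0.8, 0.3, 0.6]]
panel_b = [[0.8, 0.7, 0.9, 0.6, 0.8],
           [0.9, 0.7, 0.8, 0.7, 0.8],
           [0.7, 0.7, 1.0, 0.5, 0.8]]
student_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # items answered correctly

cut_a = angoff_cut_score(panel_a)
cut_b = angoff_cut_score(panel_b)
print(f"Panel A: cut {cut_a:.1f}, {percent_proficient(student_scores, cut_a):.0f}% proficient")
print(f"Panel B: cut {cut_b:.1f}, {percent_proficient(student_scores, cut_b):.0f}% proficient")
```

The same ten students are 70 percent “proficient” under one imaginary panel and 40 percent under the other, which is the kind of swing Koretz is warning about when judgments, rather than measurements, set the bar.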

Koretz continues his warning: “However, setting the standards themselves is just the beginning. What gives the performance standards real bite is their translation into concrete targets for educators, which depends on more than the rigor of the standard itself.  We have to say just who has to reach the threshold. We have to say how quickly performance has to increase—not only overall but for different types of kids and schools. A less obvious but equally important question is how much variation in performance is acceptable… A sensible way to set targets would be to look for evidence suggesting how rapidly teachers can raise achievement by legitimate means—that is, by improving instruction, not by using bad test prep, gaming the system, or simply cheating…  However, the targets in our test-based accountability systems have often required unremitting improvements, year after year, many times as large as any large-scale change we have seen.” (The Testing Charade, pp. 125-126)

Koretz concludes: “(I)t is clear that the implicit assumption undergirding the reforms is that we can dramatically reduce the variability of achievement… Unfortunately, all evidence indicates that this optimism is unfounded.  We can undoubtedly reduce variations in performance appreciably if we summoned the political will and committed the resources to do so—which would require a lot more than simply imposing requirements that educators reach arbitrary targets for test scores.” (The Testing Charade, p. 131)

Decades of Academic Research Support Community Schools Strategy in New York City’s Renewal Schools

So-called “corporate” school reform has been defined by setting standards and testing students to see if they have met the standards.  Rewards and punishments follow for the teachers and schools said to have produced these results. The assumption has been that a school is a closed box that can turn around the lives of the enrolled students—all apart from the fact that students spend only six or seven hours of the day at school. Corporate school reformers said they would disrupt the stasis they thought defined bureaucratic public schools by offering rewards and punishments to motivate teachers to work harder and smarter. Many of these so-called education reformers came from the business schools and employed competition as their primary motivator. And the politicians who followed their advice brought us test score targets to be met and a promise quickly to make every child a winner.

We were warned in advance that this wouldn’t work as we planned.  Dr. James Comer at the Yale School Development Program created a multifaceted program to help schools support the most vulnerable children and to engage educators, parents and the community in this process of building trust and strong relationships.  In 1997, in his book Waiting for a Miracle, Comer described the results. While his staff and outside evaluators believed that the Comer schools had made important progress in improving the children’s education, Comer wrote: “Our best approximation suggests that after three years about a third of the schools make significant social and academic improvement, a third show a modest improvement which is often difficult to sustain, and a third show no gain.” (Waiting for a Miracle, p. 72) The Comer program suggested that seven years was a more realistic timeline to look for real school improvement.

One of the most artificial aspects of corporate school reform was the setting of achievement test targets and short timelines as a motivator.  No Child Left Behind established that all American children in public schools would be proficient by 2014 or their schools and teachers would be punished. As we moved closer to 2014, everybody began to realize that making all schools produce high scores wasn’t working.  When it became apparent that almost all American schools would fall behind in raising what was called each student’s Adequate Yearly Progress, Arne Duncan, then Secretary of Education, began issuing No Child Left Behind Waivers to states which would promise to meet his particular school reform priorities in exchange for his willingness not to declare that state’s schools “failing.”

Slowly it began to be admitted that students’ lives outside school affect their test scores, and that schools alone cannot solve the serious challenges resulting from concentrated poverty.  In 2012, Diane Ravitch described achievement gaps as a complex challenge in children’s lives—not merely the result of the quality of a particular school: “Such gaps exist wherever there is inequality, not only in this country, but internationally.  In every country, the students from the most advantaged families have higher test scores on average than students from the least advantaged families.” (Reign of Error, p. 57)

Last year, the Harvard University testing expert Daniel Koretz described the problems of demanding ever-rising test scores from every school on the same prescribed timeline: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

Here we are in 2019, when many educators have realized that something has to be done at school to address the needs of children living in communities where poverty is concentrated. A broad-based movement to make schools a social service and healthcare center for families and to add preschool and after school and summer programs at school has emerged.  These are called Community Schools. Here is how the Children’s Aid Society in New York City defines a Community School: “The foundations for community schools can be conceptualized as a Developmental Triangle that places children at the center, surrounded by families and communities.  Because students’ educational success, health and well-being are the focus of every community school, the legs of the triangle consist of three interconnected support systems: A strong core instructional program… expanded learning opportunities… and a full range of health, mental health and social services designed to promote children’s well-being and remove barriers to learning.” (Building Community Schools: A Guide for Action, p. 1)

This week the Washington Post’s Valerie Strauss published a new piece by the National Education Policy Center’s Kevin Welner and Julia Daniel pleading with the New York City Schools not to give up on NYC’s 2014 expansion of Community Schools. When he made Community Schools the centerpiece of his Renewal Program for the city’s struggling schools, Mayor Bill de Blasio suggested he would improve the schools rather than following his predecessor Michael Bloomberg’s strategy of shutting down such schools.

But lately De Blasio is being criticized because the school turnarounds have not been quick enough.  In October, Eliza Shapiro, writing for the NY Times, suggested, “New York knew some schools in its $773 million plan were doomed. They kept children in them anyway.” The New York Schools Chancellor, Richard Carranza, responded by affirming  De Blasio’s original goal: “Four years ago, Mayor Bill de Blasio made a bold—and correct—investment in 94 of New York City’s most underserved schools.  Rather than giving up on these students and schools, the city invested in them… The Renewal graduation rate has climbed from 52 to 66 percent.  Attendance has increased from 84 percent to 89 percent.  Chronic absenteeism has fallen from 47 to 36 percent.  Suspensions have decreased by 54 percent… While we have not yet decided the future of the Renewal initiative, we will never stop investing in the kinds of programs that have allowed us to improve so many schools that would have closed under prior administrations.”

In their new piece, New York City Offers Some Unpleasant Truths about School Improvement, Kevin Welner and Julia Daniel defend Mayor de Blasio’s plan for Community Schools, although they point out that the Renewal School program underestimated the amount of time it takes to build the kind of trust and relationships James Comer wrote about and to address the challenges poverty poses for children: “The Renewal program—which also supports schools in the city’s larger Community Schools Initiative (CSI)—assists schools by increasing supports, training, and resources for students and teachers. The CSI increases family and community engagement and creates collaborative structures and practices…. These approaches—extended learning time, family and community engagement, collaborative leadership, and integrated student supports—are fundamental to community schools models and informed by decades of research showing that out-of-school factors have an overwhelming influence on student outcomes.  In turning to this evidence-based approach, the mayor should be applauded.”

Welner and Daniel recognize that a three year timeline isn’t enough: “Fortunately, with the initial (three-year) results now in, we do see encouraging improvements… Yet as is the case with all major reform efforts, there have also been challenges that must be addressed….  For example, these schools have been hampered by high levels of principal turnover.  Further, a quarter of the initial Renewal schools have been closed for not meeting the program’s ambitious goals.”

The National Education Policy Center’s purpose is to bring the peer-reviewed research of the academy to bear on the policy that shapes public schools.  Welner and Daniel starkly assess the impact of child poverty on school achievement and the optimal ways schools can address these challenges:

“Here, we need to step back and confront an unpleasant truth about school improvement.  A large body of research teaches us that the opportunity gaps that drive achievement gaps are mainly attributable to factors outside our schools: concentrated poverty, discrimination, disinvestment, and racially disparate access to a variety of resources and employment opportunities.

“Research finds that school itself has much less of an impact on student achievement than out-of-school factors such as poverty.  While schools are important—and can certainly be crucial in the lives of some students—policymakers repeatedly overestimate their capacity to overcome the deeply detrimental effects of poverty and racism….

“But students in many of these communities are still rocked by housing insecurity, food insecurity, their parents’ employment insecurity, immigration anxieties, neighborhood violence and safety, and other hassles and dangers that can come with being a low-income person of color in today’s United States.

“We need to acknowledge these two realities—seemingly in tension: (1) that education reforms can be very helpful, if they’re the right ones and if we’re patient and committed; but (2) we as a society are deceiving ourselves if we think we’ll transform educational outcomes without addressing economic inequality.”

Finally, Welner and Daniel recommend that in New York City, “De Blasio should remain committed to the Renewal program—a program based on decades of rigorous research and already showing meaningful benefits for underserved students… When we look across the nation and see other leaders chasing silver bullets, or ignoring educational inequity altogether, we should rejoice that New York and its mayor are engaged in the demanding yet essential work of partnering with communities to address basic needs….”

Faith in High Stakes Testing Fades, Even Among the Corporate School Reformers

After a recent twenty-fifth anniversary conference at the Center on Reinventing Public Education at the University of Washington, Bothell (a Gates-funded education-reform think tank), Chalkbeat’s Matt Barnum summarized presentations by a number of speakers who demonstrate growing skepticism about the high-stakes, standardized testing regime that has dominated American public education for over a quarter of a century.

Because the Center on Reinventing Public Education is known as an advocate for portfolio school reform and corporate accountability, you might expect adherence to the dogma of test-and-punish, but, notes Barnum:  “The pervasiveness of the complaints about testing was striking, given that many education reform advocates have long championed using test scores to measure schools and teachers and then to push them to improve.”

Then at a Massachusetts Institute of Technology School Access and Quality Summit early this month, Paymon Rouhanifard presented a major policy address challenging the use of high stakes testing to rank and rate public schools.  Rouhanifard was until very recently Chris Christie’s appointed, school-reformer superintendent in Camden, New Jersey.  Formerly he was the director in New York City of Joel Klein’s Office of Portfolio Management.  Rouhanifard describes the belief system he brought with him to Camden and explains how his five-year tenure as Camden’s superintendent transformed his thinking: “Our belief was that politics and bureaucracy had inhibited the progress Camden students and families deserved to overcome the steep challenges the city was facing…  We believed it was important for the district to segue out of being a highly political monopoly operator of schools….  This is a story about an evolution of my own thinking during that five-year experience…. What I’m referring to are the math and literacy student achievement data we utilize to drive so many of the critical decisions we make… My realization a few years ago was that I rarely asked questions about what these tests actually told us.  What they didn’t tell us.  And perhaps most importantly, what were the specific behaviors they incentivized, and what were the general trade-offs when we acutely focus on how students do on state tests.”

In 2013, at the beginning of his tenure, Rouhanifard introduced a school report card that rated each school primarily by students’ standardized test scores. Two years ago Rouhanifard eliminated his own school report cards.  He describes his realization: “We are spending an inordinate amount of time on formative and interim assessments and test prep, because those are the behaviors we have incentivized.  We are deprioritizing the sciences, the arts, and civic education…. I… believe the drawbacks currently outweigh the benefits.  That we haven’t been honest about the trade-offs.”

Shael Polakow-Suransky, like Rouhanifard, held a position in Joel Klein’s “reformer” school administration in New York City.  Now the president of Bank Street College of Education, he was formerly Klein’s deputy schools chancellor. Barnum explains that Polakow-Suransky has become an emphatic critic of the nation’s high-stakes standardized testing regime: “The biggest barrier to student learning and closing the achievement gap is the current system of standardized tests.”

In a piece at The74, the  Thomas Fordham Institute’s Robert Pondiscio quotes Polakow-Suransky: “All of us were well-intentioned in pushing this agenda, but the tools we developed were not effective in raising the bar on a wide scale.”

While the Thomas Fordham Institute has endorsed corporate school reform including high-stakes, test-based accountability, Fordham’s Pondiscio now acknowledges that under the Every Student Succeeds Act, U.S. public schools have become mired in an education culture defined by test-based accountability.  Though he seems unclear on the way forward, Pondiscio now advocates for serious reconsideration: “The challenge is not testing vs. not testing.  It’s not accountability vs. none.  Both bring benefits of different kinds, and both are required by a federal law that’s not going to change anytime soon.  The challenge is to develop a policy vision that supports—not thwarts—the classroom practices and long-term student outcomes we seek… The problem is the reductive culture of testing, which has come to shape and define American education, particularly in the kinds of schools attended by our most disadvantaged children.”

There are some who remain faithful to the school reformer dogma. The Center on Reinventing Public Education’s Robin Lake tries to change the subject: “We need a more productive debate about school accountability, not tired arguments over testing.” And Matt Barnum quotes Sandy Kress—still a tried-and-true believer in the No Child Left Behind regime he helped create: “Research shows clearly that accountability made a real difference in this country in narrowing the achievement gap and lifting student achievement.”

Of course, research does not clearly show that Sandy Kress’s kind of No Child Left Behind accountability made a real difference.  Here is Harvard’s Daniel Koretz, in the authoritative book he published a year ago, The Testing Charade: Pretending to Make Schools Better.  It is perhaps this volume by an academic expert on testing that has helped change the minds of some of the corporate school reformers quoted above.  Koretz writes: “It is no exaggeration to say that the costs of test-based accountability have been huge.  Instruction has been corrupted on a broad scale.  Large amounts of instructional time are now siphoned off into test-prep activities that at best waste time and at worst defraud students and their parents.  Cheating has become widespread.  The public has been deceived into thinking that achievement has dramatically improved and that achievement gaps have narrowed.  Many students are subjected to severe stress, not only during testing but also for long periods leading up to it.  Educators have been evaluated in misleading and in some cases utterly absurd ways.  Careers have been disrupted and in some cases ended.  Educators, including prominent administrators, have been indicted and even imprisoned.  The primary benefit we received in return for all of this was substantial gains in elementary-school math that don’t persist until graduation.  This is true despite the many variants of test-based accountability the reformers have tried, and there is nothing on the horizon now that suggests that the net effects will be better in the future. On balance, then, the reforms have been a failure.” (The Testing Charade, pp. 191-192)

Introducing readers to Don Campbell, “one of the founders of the science of program evaluation,” Koretz defines the problems inherent in our society’s quarter century of high-stakes, test-and-punish school accountability by quoting Campbell’s Law:  “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”  Campbell directly addresses the problem of high stakes testing to rank and rate schools:  “Achievement tests may well be valuable indicators of … achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (The Testing Charade, pp. 38-39)

How has the testing regime operated perversely to undermine the schools serving our society’s most vulnerable children—the ones we were told No Child Left Behind would catch up academically if only we created incentives and punishments to motivate their teachers to work harder?  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools.  The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others.  Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do.  This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’  It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic.  The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

Besides imposing unreasonable and damaging punishments on the schools and teachers serving our society’s poorest children, Koretz believes our commitment to a regime of punitive testing has distracted our society from developing the commitment to address the real needs of children and schools in places where poverty is concentrated: “We can undoubtedly reduce variations in performance appreciably, if we summoned the political will and committed the resources to do so—which would require a lot more than simply imposing requirements that educators reach arbitrary targets for test scores.” (The Testing Charade, p. 131)

Schools Serving Very Poor Children Need Financial Assistance. Instead Ohio Beats Them Up.

Ohio operates a test-and-punish accountability scheme that ranks and rates schools and school districts, and punishes school districts whose scores are low.  All the while, the state has diminished its effort to support public education and equalize funding.

In mid-September, for example, the state released school report cards awarding schools and school districts letter grades—“A” through “F.”  Like two other districts recently taken over by the state after receiving a series of “F” grades, East Cleveland will be seized by the state and assigned a state-appointed overseer CEO to replace its school superintendent and an appointed commission to replace the local school board.  East Cleveland—an economically and racially segregated inner-ring Cleveland suburban school district—is among Ohio’s very poorest.  Historically the residents in the community have voted high millage relative to their incomes to pay for their public schools despite the closure of local industry and the collapse of the economy.  The school districts in two other impoverished communities, Youngstown and Lorain, were taken over in recent years without a subsequent rise in test scores, the state’s chosen metric. Both received “F” grades again this year. The implementation of state takeover has been insensitive and insulting. Ohio’s Plunderbund reported in March that Krish Mohip, the state overseer CEO in Youngstown, feels he cannot safely move his family to the community where he is in charge of the public schools. He has also been openly interviewing for other jobs. Lorain’s CEO, David Hardy, tried to donate to the school district the amount of what would be the property taxes on a Lorain house when he announced that he does not intend to bring his family to live in Lorain.

EdChoice vouchers are a second high stakes punishment in the school attendance zones of “F”-rated schools. EdChoice gives families the opportunity to opt their children out of “failing” public schools by granting their children a chance to leave at public expense.  Writing for the Heights Observer, Susan Kaeser describes how this works in another Cleveland inner-ring suburban school district: “Access to EdChoice vouchers is tied to Ohio’s deeply flawed education accountability system.  If the aggregate test score data for an individual public school falls short, the school is defined as an EdChoice school.  Anyone residing in the attendance area of that school who could have attended that school is eligible for an EdChoice voucher… Nearly every district that has EdChoice designation serves many high-need students.”

Most students using EdChoice vouchers in the Cleveland Heights-University Heights School District which Kaeser describes are attending religious schools, and in fact real estate companies have been marketing houses in the state-designated neighborhoods as qualifying for EdChoice vouchers. Children can qualify for one of these vouchers as Kindergartners, without ever attending or intending to enroll in the public school that anchors the neighborhood. As Kaeser explains, “Once a student receives a voucher it can be renewed until the student graduates… Voucher use has grown exponentially as more schools were designated EdChoice and as recipients renew their vouchers.  This year, 176 Kindergarten students received first-time vouchers (without previously enrolling in a public school), adding to the total of more than 650 recipients.  The expected loss to the CH-UH district this year from EdChoice is $3.7 million….”  The rapid expansion of this program is fiscally unsustainable.

In a paywalled, September 14, 2018, On The Money report, a legislative update from the Hannah News Service, the Ohio Education Policy Institute’s school finance expert, Howard Fleeter, tracks the impact statewide of Ohio’s EdChoice vouchers. Over the ten years since the program’s inception, it has grown from 3,100 to 22,153 students.  Fleeter explains: “EdChoice vouchers are worth up to $4,650 for students in grades K-8 and up to $6,000 for students in grades 9-12.”  He continues, explaining that while the money ostensibly comes from the state, EdChoice is “funded through a ‘district deduction’ system… The deduction system means that the voucher student is counted in the district of residence’s Formula ADM (Average Daily Membership) and then the voucher is paid for by deducting the voucher amount from the district’s state aid.  This can often result in a district seeing a deduction for the voucher greater than the state aid that was received for that student, meaning that the district is in effect subsidizing the voucher program.”  While $10,368,839 was spent statewide for EdChoice vouchers in FY 2007, by FY 2017 the amount statewide had climbed to $102,688,259.  Over the decade, a total of $649,158,483 of state and local tax dollars was diverted from public schools to private school tuition through EdChoice vouchers.

All of Ohio’s school districts where students qualify for EdChoice vouchers are districts serving very poor children. And yet, last month in a new report Howard Fleeter explains: “(R)esidential taxpayers in the low wealth districts are paying taxes at nearly the same rate as are their higher wealth counterparts… The Tax Effort measure shows that when ability to pay is taken into account, the low wealth districts are levying taxes at the highest rate relative to their income, while the highest wealth districts are levying taxes at the lowest rate relative to income.”  Fleeter continues: “(T)he lowest wealth… districts have seen their share of total state and local resources fall from 26.4% in FY99 to 23.1% in FY19, while the highest wealth… school districts have seen their share of total state and local resources increase from 22.2% in FY99 to 23.4% in FY19.  Unsurprisingly… a variety of equity measures indicate that equity in state and local school operating revenues improved from FY99 to FY 09, but regressed somewhat from FY09 to FY19.”

When he was interviewed by Jim Siegel for the Columbus Dispatch, Fleeter was less technical and more candid about the state’s school funding formula: “The formula itself is kind of just spraying money in a not-very-targeted way.”

Siegel reminds readers about the impact of the 2008 Great Recession, compounded by state tax cuts promoted by Governor John Kasich and passed by the legislature: “GOP leaders… eliminated the tangible personal property tax, which more than a decade ago generated about $1.1 billion per year for schools.  For a time, state officials reimbursed schools for those losses, but that has largely been phased out… And finally, there are Gov. John Kasich’s funding formula and fiscal priorities, including income-tax cuts that have meant an estimated $3 billion less in available revenue each year… Kasich crafted a new formula designed to drive funding to districts with the least ability to raise their own local funds, but Fleeter and public education officials have argued that it doesn’t quite work properly.”

Through various schemes to privatize education—EdChoice and several other voucher programs along with a large charter school sector—Governor Kasich and the Republican legislature have found another method, in addition to the flawed school funding formula, to divert needed state dollars out of public schools across the state.  State takeovers of struggling school districts and EdChoice vouchers are the clearest examples in state policy of punitive, top down programs that blame and punish local educators in poor communities instead of driving resources and support to communities serving concentrations of children in poverty.

Once again, it is appropriate to quote Harvard’s Daniel Koretz explaining in The Testing Charade just how high stakes, test-based accountability blames and punishes schools that face the overwhelming challenge of student poverty:  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

Rick Hess’s Mistake: Failure of Test-and-Punish Is Not Limited to a Few Districts That Have Disappointed

Frederick M. Hess, the director of education policy studies at the American Enterprise Institute, has always been a corporate education reform kind of guy. That is why Hess’s honest analysis this week of the ultimate fraud of a succession of school district miracles—Washington, D.C.’s test score and graduation rate miracle under Michelle Rhee and those who followed her, Alonzo Crim’s Atlanta in the 1980s, Houston’s Texas Miracle under Rod Paige, Arne Duncan’s Chicago, and Beverly Hall’s Atlanta—is so refreshingly candid.

In all of these cases, as Hess points out, there was “a remarkable dearth of attention paid to ensuring that the metrics (were) actually valid and reliable.”  Second, it was “tempting for civic leaders and national advocates to accept happy success stories at face value—especially when they (were) fronted by a charismatic superintendent.” And finally “reformers and reporters (made) things worse with their lust for ‘celebrity superintendents’ and ‘model systems.’ Their fascination nurtur(ed) an echo chamber in which a handful of leaders (got) exalted, often for too-good-to-be-true results.”

One must give Hess credit for honestly admitting the failure of so much of what his own kind of school reformers have been exalting for the past quarter century—business school accountability for schools, driven by universal standardized testing, and evaluated by two primary outcomes—standardized test scores and graduation rates. But Hess makes a mistake when he attributes the problem to a few “model” school districts that have disappointed.

Hess’s explanation is inadequate.  Inadequate because the system itself—the whole idea of school reform based on high stakes testing—cannot work.  Daniel Koretz, the Harvard specialist on testing, tells us why in a recent book: The Testing Charade: Pretending to Make Schools Better.

Koretz defines the problem with high-stakes-test-based school accountability by exploring a primary principle of social science research. Forty years ago, Don Campbell, “one of the founders of the science of program evaluation,” articulated a core principle now known as “Campbell’s Law”: “The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (p. 38)

How does Campbell’s Law describe the dilemma Frederick Hess identifies?  Koretz quotes Don Campbell himself describing the distortion that will follow when high stakes consequences are attached to a school district’s capacity to raise its aggregate test scores: “Achievement tests may well be valuable indicators of… achievement under conditions of normal teaching aimed at general competence.  But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.” (p. 39)

In The Testing Charade, Koretz provides extensive evidence about all the ways high stakes tied to test scores have triggered Campbell’s Law—to invalidate the test results themselves and to undermine our education system and the experiences of teachers and students trapped by No Child Left Behind and the Every Student Succeeds Act in a scheme to raise test scores at all costs.

One consequence is score inflation: “All that is required for scores to become inflated is that the sampling used to create a test has to be predictable… For inflation to occur, teachers or students need to capitalize on this predictability, focusing on the specifics of the test at the expense of the larger domain.” (p. 62)  We read about all the ways curriculum designers and teachers are incentivized to focus their classes on the specific elements of any particular academic discipline that have appeared on previous tests.

A second consequence, related to the first, is flat-out test-prep. Test prep narrows what is taught to students to the material that is tested and drills students about using clues in the test itself to come up with the right answers. Koretz identifies three kinds of bad test prep. Reallocation between subjects has been common when schools emphasize No Child Left Behind’s tested subjects—reading and math—and cut back on social studies, the arts, music and recess. Reallocation within subjects is when schools study past years’ versions of the state tests and ask teachers to focus on particular aspects of a subject.  Finally there is coaching. Schools and test-prep companies teach students to respond in a formulaic way to the format of the questions themselves. Koretz explains why all this has implications for educational equity: “Inappropriate test preparation, like score inflation, is more severe in some places than in others. Teachers of high-achieving students have less reason to indulge in bad preparation for high-stakes tests because the majority of their students will score adequately without it—in particular, above the ‘proficient’ cut score that counts for accountability purposes. So one would expect that test preparation would be a more severe problem in schools serving high concentrations of disadvantaged students…. Once again, disadvantaged kids are getting the short end of the stick.” (pp. 116-117)

And a third consequence, demonstrated in every one of Frederick Hess’s examples, is cheating. Koretz examines the biggest cheating scandals, notably Atlanta, Philadelphia, and Washington, DC.  He notes: “Cheating—by teachers and administrators, not by students—is one of the simplest ways to inflate scores, and if you aren’t caught, it’s the most dependable.” Sometimes teachers or administrators erase and change students’ answers; sometimes they provide teachers or students with the test items in advance; other times teachers give students the answer during the test.  And finally sometimes schools “scrub” off the enrollment rolls the students who are likely to fail.

Koretz presents the questions around cheating by educators as morally fraught. After all, test scores are not simply a proxy for the quality of a school or a school district:  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

In a system that, by its very structure, is guaranteed to trigger Campbell’s Law, Koretz wonders about the moral implications of cheating: “Just who is responsible?  Is it just the people who actually carry out the fraud or require it?  Or are those who create the pressures to cheat also culpable, even if not criminally?” (p. 91)

Like Frederick Hess, Daniel Koretz recognizes that although outcomes-based, test-and-punish school accountability has been hyped and celebrated, ultimately this kind of school policy has not improved schools as promised.  Koretz digs deeper, however, to expose that the system itself—not merely its abuse by particular educators in particular school districts—is deeply flawed.

Koretz concludes: “It is no exaggeration to say that the costs of test-based accountability have been huge. Instruction has been corrupted on a broad scale. Large amounts of instructional time are now siphoned off into test-prep activities that at best waste time and at worst defraud students and their parents.  Cheating has become widespread. The public has been deceived into thinking that achievement has dramatically improved and that achievement gaps have narrowed. Many students are subjected to severe stress… The primary benefit we received in return for all of this was substantial gains in elementary-school math that don’t persist until graduation… On balance, then, the reforms have been a failure.” (pp. 191-192)