Politicians Forget that Cut Scores on Standardized Tests Are Not Grounded in Science

Last week the New York Times’ Dana Goldstein and Manny Fernandez reported on a political fight in Texas over the scoring of the STAAR—the State of Texas Assessments of Academic Readiness—the state’s version of the achievement test each state must still administer every year in grades 3-8 and once in high school.  The federal Every Student Succeeds Act, passed in 2015 to replace No Child Left Behind, still mandates annual testing, although Congress no longer imposes its own high stakes punishments for failure.

However, Congress still does require the states to submit plans to the U.S. Department of Education declaring what will be the consequences for low-scoring schools.  Goldstein and Fernandez explain that Texas, like many other states, still imposes punishments for the low scorers instead of offering help: “The test, the State of Texas Assessments of Academic Readiness, or STAAR, can have profound consequences not just for students but for schools across the state, hundreds of which have been deemed inadequate and are subject to interventions that critics say are undue.”  Schools have to provide help for students who are not on grade level. Also: “Texas grades its districts on an A through F scale, in part based on how many students are meeting or exceeding grade-level standards… Persistently failing schools, and districts with just a single such school, can be shut down or taken over by the state—a threat facing the state’s largest school system, in Houston.”

Decades of research show that, in the aggregate, standardized test scores correlate with family and neighborhood income. In a country where segregation by race and poverty continues to grow, it is now recognized among experts and researchers that rating and ranking schools and districts by their aggregate test scores merely brands the poorest schools as failing. When sanctions are attached, political regimes of test-based accountability merely punish the schools and the teachers and the students in the poorest places.

In an excellent 2017 book, The Testing Charade: Pretending to Make Schools Better, Harvard professor Daniel Koretz explains the correlation of aggregate standardized test scores with family and community economics: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.”  (The Testing Charade, pp. 129-130)

Goldstein and Fernandez report that the political fight in Texas this month is about the test scores in third grade reading: “The 2018 STAAR tests found that 58 percent of Texas third graders are not reading at grade level. On the 2017 National Assessment of Educational Progress, given to a sample of fourth graders across the country, 72 percent of Texas students were not proficient in reading—a fact the state has cited as evidence that tough local standards are warranted.”

Like many other states, Texas blames the public schools.  But Goldstein and Fernandez present other factors that ought to be considered here: “More than half of the state’s public school students are Hispanic and nearly 60 percent come from low-income families.  About a fifth are still learning English.”  The state argues that’s all the more reason to set the passing cut score high and motivate schools to catch kids up quicker.

But educators and parents and some politicians in Texas are pushing back. They contend that the bar is set so high that students who are reading at grade level still score below the cut score for proficiency.  There is a lot of discussion of reading passages said to be two grade levels ahead of the students being tested and of something called Lexile measures, which involve the number of syllables in a word and are used to evaluate the difficulty of the passages on the test.

It would clear up a lot of the trouble if more people read Chapter 8, “Making Up Unrealistic Targets,” in Daniel Koretz’s book. Koretz explains that there is nothing really scientific about where “proficient” cut scores are set: “If one doesn’t look too closely, reporting what percentage of students are ‘proficient’ seems clear enough. Someone somehow determined what level of achievement we should expect at any given grade—that’s what we will call ‘proficient’—and we’re just counting how many kids have reached that point. This seeming simplicity and clarity is why almost all public discussion of test scores is now cast in terms of the percentage reaching either the proficient standard, or occasionally, another cut score… The trust most people have in performance standards is essential, because the entire educational system now revolves around them. The percentage of kids who reach the standard is the key number determining which teachers and schools will be rewarded or punished.” (The Testing Charade, p. 120)

Koretz explains that standardized test cut scores are not set scientifically. There is no scientific or even magical way of deciding exactly which reading passages every third grader must be able to decode and comprehend, and anyway, students in third grade are not consistent.  Koretz examines several methods used by panels of judges to set the “proficient” level.  He adds that the methods used by different state panels don’t arrive at the same cut scores: “The percentage of kids deemed to be ‘proficient’ sometimes varies dramatically from one method to another.” (The Testing Charade, p. 124)

Goldstein and Fernandez indicate that Texas uses the National Assessment of Educational Progress (NAEP) as its audit test by which it judges the accuracy of the way Texas sets its levels of proficiency. When the scores on the STAAR are compared to the scores on the NAEP, politicians in Texas are really concerned because NAEP shows that 72 percent of fourth graders in Texas are not proficient in reading, which looks even worse than the 58 percent of third graders who score below grade level on the STAAR.

But the matter is not as dire as it would appear. The education historian Diane Ravitch served on the National Assessment Governing Board for seven years.  Ravitch explains that the cut scores on the NAEP are set artificially high.  It is much harder to reach the proficient level than what our common understanding of the term “proficient” would lead us to expect: “‘Proficient’ on NAEP does not indicate ‘average’ performance; it is set very high… There are four levels. At the top is ‘advanced.’ Then comes ‘proficient.’ Then ‘basic.’ And last, ‘below basic.’  Advanced is truly superb performance, which is like getting an A+. Among fourth graders, 8% were advanced readers in 2011; 3% of eighth graders were advanced. In reading, these numbers have changed little in the past twenty years…   Proficient is akin to a solid A. In reading, the proportion who were proficient in fourth grade reading rose from 29% in 1992 to 34% in 2011. The proportion proficient in eighth grade also rose from 29% to 34% in those years… Basic is akin to a B or C level performance. Good but not good enough.”

The argument about what different “proficient” levels really mean is old and tired, but we can’t seem to move beyond it. Today we know that the No Child Left Behind Act was aspirational. It was supposed to motivate teachers to work harder to raise scores. Policymakers hoped that if they set the bar really high, teachers would figure out how to get kids over it. It didn’t work.  No Child Left Behind said that all children in American public schools would be proficient by 2014 or their school would be labeled failing. Finally, as 2014 loomed, Arne Duncan had to give states waivers to avoid what would have happened had the law been enforced: all American public schools would have been declared “failing.”

As we continue to haggle about the cut scores by which we judge our children and their schools, however, there is one thing we almost never consider.  What if—instead of punishing the schools where scores are lower and instead of making their children drill harder and attend Saturday cram sessions—we were willing to invest more tax dollars in the lowest scoring schools?  What if we made classes smaller to make it possible for teachers to work more personally with each student?  What if we made sure that the schools in our poorest communities had well stocked libraries with certified librarians and story-hours once or even twice a week?

Koretz comes to this same conclusion, although he explains it more theoretically: “(I)t is clear that the implicit assumption undergirding the reforms is that we can dramatically reduce the variability of achievement… Unfortunately, all evidence indicates that this optimism is unfounded.  We can undoubtedly reduce variations in performance appreciably if we summoned the political will and committed the resources to do so—which would require a lot more than simply imposing requirements that educators reach arbitrary targets for test scores.” (The Testing Charade, p. 131)

Schools Serving Very Poor Children Need Financial Assistance. Instead Ohio Beats Them Up.

Ohio operates a test-and-punish accountability scheme that ranks and rates schools and school districts, and punishes school districts whose scores are low.  All the while, the state has diminished its effort to support public education and equalize funding.

In mid-September, for example, the state released school report cards awarding schools and school districts letter grades—“A” through “F.”  Like two other districts recently taken over by the state after receiving a series of “F” grades, East Cleveland will be seized by the state and assigned a state-appointed overseer CEO to replace its school superintendent and an appointed commission to replace the local school board.  East Cleveland—an economically and racially segregated inner-ring Cleveland suburban school district—is among Ohio’s very poorest.  Historically the residents in the community have voted high millage relative to their incomes to pay for their public schools despite the closure of local industry and the collapse of the economy.  The school districts in two other impoverished communities, Youngstown and Lorain, were taken over in recent years without a subsequent rise in test scores, the state’s chosen metric. Both received “F” grades again this year. The implementation of state takeover has been insensitive and insulting. Ohio’s Plunderbund reported in March that Krish Mohip, the state overseer CEO in Youngstown, feels he cannot safely move his family to the community where he is in charge of the public schools. He has also been openly interviewing for other jobs. Lorain’s CEO, David Hardy, tried to donate the amount of what would be the property taxes on a Lorain house to the school district when he announced that he does not intend to bring his family to live in Lorain.

EdChoice vouchers are a second high stakes punishment in the school attendance zones of “F”-rated schools. EdChoice gives families the opportunity to opt their children out of “failing” public schools by granting their children a chance to leave at public expense.  Writing for the Heights Observer, Susan Kaeser describes how this works in another Cleveland inner-ring suburban school district: “Access to EdChoice vouchers is tied to Ohio’s deeply flawed education accountability system.  If the aggregate test score data for an individual public school falls short, the school is defined as an EdChoice school.  Anyone residing in the attendance area of that school who could have attended that school is eligible for an EdChoice voucher… Nearly every district that has EdChoice designation serves many high-need students.”

Most students using EdChoice vouchers in the Cleveland Heights-University Heights School District which Kaeser describes are attending religious schools, and in fact real estate companies have been marketing houses in the state-designated neighborhoods as qualifying for EdChoice vouchers. Children can qualify for one of these vouchers as Kindergartners, without ever attending or intending to enroll in the public school that anchors the neighborhood. As Kaeser explains, “Once a student receives a voucher it can be renewed until the student graduates… Voucher use has grown exponentially as more schools were designated EdChoice and as recipients renew their vouchers.  This year, 176 Kindergarten students received first-time vouchers (without previously enrolling in a public school), adding to the total of more than 650 recipients.  The expected loss to the CH-UH district this year from EdChoice is $3.7 million….”  The rapid expansion of this program is fiscally unsustainable.

In a paywalled September 14, 2018, On The Money report, a legislative update from the Hannah News Service, the Ohio Education Policy Institute’s school finance expert, Howard Fleeter, tracks the statewide impact of Ohio’s EdChoice vouchers. Over the ten years since the program’s inception, it has grown from 3,100 to 22,153 students.  Fleeter explains: “EdChoice vouchers are worth up to $4,650 for students in grades K-8 and up to $6,000 for students in grades 9-12.”  He continues, explaining that while the money ostensibly comes from the state, EdChoice is “funded through a ‘district deduction’ system… The deduction system means that the voucher student is counted in the district of residence’s Formula ADM (Average Daily Membership) and then the voucher is paid for by deducting the voucher amount from the district’s state aid.  This can often result in a district seeing a deduction for the voucher greater than the state aid that was received for that student, meaning that the district is in effect subsidizing the voucher program.”  In FY 2007, $10,368,839 was spent statewide for EdChoice vouchers; by FY 2017, the amount had climbed to $102,688,259.  Over the decade, a total of $649,158,483 of state and local tax dollars was diverted from public schools to private school tuition through EdChoice vouchers.

All of Ohio’s school districts where students qualify for EdChoice vouchers are districts serving very poor children. And yet, last month in a new report Howard Fleeter explains: “(R)esidential taxpayers in the low wealth districts are paying taxes at nearly the same rate as are their higher wealth counterparts… The Tax Effort measure shows that when ability to pay is taken into account, the low wealth districts are levying taxes at the highest rate relative to their income, while the highest wealth districts are levying taxes at the lowest rate relative to income.”  Fleeter continues: “(T)he lowest wealth… districts have seen their share of total state and local resources fall from 26.4% in FY99 to 23.1% in FY19, while the highest wealth… school districts have seen their share of total state and local resources increase from 22.2% in FY99 to 23.4% in FY19.  Unsurprisingly… a variety of equity measures indicate that equity in state and local school operating revenues improved from FY99 to FY 09, but regressed somewhat from FY09 to FY19.”

When he was interviewed by Jim Siegel for the Columbus Dispatch, Fleeter was less technical and more candid about the state’s school funding formula: “The formula itself is kind of just spraying money in a not-very-targeted way.”

Siegel reminds readers about the impact of the 2008 Great Recession, compounded by state tax cuts promoted by Governor John Kasich and passed by the legislature: “GOP leaders… eliminated the tangible personal property tax, which more than a decade ago generated about $1.1 billion per year for schools.  For a time, state officials reimbursed schools for those losses, but that has largely been phased out… And finally, there are Gov. John Kasich’s funding formula and fiscal priorities, including income-tax cuts that have meant an estimated $3 billion less in available revenue each year… Kasich crafted a new formula designed to drive funding to districts with the least ability to raise their own local funds, but Fleeter and public education officials have argued that it doesn’t quite work properly.”

Through various schemes to privatize education—EdChoice and several other voucher programs along with a large charter school sector—Governor Kasich and the Republican legislature have found another method, in addition to the flawed school funding formula, to divert needed state dollars out of public schools across the state.  State takeovers of struggling school districts and EdChoice vouchers are the clearest examples in state policy of punitive, top down programs that blame and punish local educators in poor communities instead of driving resources and support to communities serving concentrations of children in poverty.

Once again, it is appropriate to quote Harvard’s Daniel Koretz explaining in The Testing Charade just how high stakes, test-based accountability blames and punishes schools that face the overwhelming challenge of student poverty:  “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores—and, particularly important in this system, more kids who aren’t ‘proficient’—than others. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms… Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

Schott Foundation’s New “Loving Cities Index” Rejects Decades-Long, Test-and-Punish School Policy

Here is Dr. John Jackson, President & CEO of the Schott Foundation for Public Education, announcing the Foundation’s new Loving Cities Index: “Considering the social and political moment, the public, private and philanthropic sectors must go beyond the normal separate silos approach to shift from a standards-based agenda where we only analyze shortcomings to a supports-based agenda where we focus on the resources needed for all students to overcome obstacles created by inequity and achieve high outcomes.”

What is our social and political moment that makes Schott’s new initiative so important?

Last month in Parkland, Florida, there was a tragedy—a school shooting in which 17 adolescents and adults were killed by a former student with a semi-automatic rifle. An outpouring of grief has turned the attention of the nation, as it should, to the insanity of the absence of restrictions on the possession of guns.

One cannot compare tragedies, of course, but it is essential that this latest tragedy not totally displace concern about another calamity happening right in front of us, but invisible nonetheless because we choose not to see it.  This one also involves students at school.  Last week, on the 50th anniversary of the release of the Kerner Commission Report, Linda Darling-Hammond and the Learning Policy Institute published a brief reminding us that economic inequality, residential segregation by income and race, and inequitable school funding improved briefly in the decade after the Kerner Report, but began once again to rise after 1980:

  • “U.S. childhood poverty rates have grown by more than 50% since the 1970s and are now by far the highest among OECD nations, reaching 22% in the latest published statistics.”
  • “In most major American cities, a majority of African American and Latino students attend public schools where at least 75% of students are from low-income families… For example, in Chicago and New York City, more than 95% of both Black and Latino students attend majority-poverty schools….”
  • “Today, about half as many Black students attend majority White schools (just over 20%) as did so in 1988, when about 44% did so.”
  • “In most states, the wealthiest (school) districts spend at least two to three times what the poorest districts can spend per pupil…. Furthermore, the wealthiest states spend about three times what the poorer states spend.”

Half a century ago Jonathan Kozol named these same problems that kill children’s spirits and block their opportunities “death at an early age.”  Today these circumstances affect several million young people as our unequal society awards high honors to wealthy suburban high schools for producing National Merit Scholars and brands the schools in our cities’ poorest school districts with “D”s and “F”s on so-called school report cards issued by state governments.

In a must-read book published last fall—The Testing Charade: Pretending to Make Schools Better—Harvard’s Daniel Koretz describes the catastrophic mistake in the test-and-punish school reform that has reigned for the past two decades: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores…. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms…. Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130) Policymakers “acted as if… (schools alone could) largely eliminate variations in student achievement, ignoring the impact of factors that have nothing to do with the behavior of educators—for example, the behavior of parents, students’ health and nutrition, and many characteristics of the communities in which students grow up.” (pp. 123-124)

Our society has chosen to blame and punish our poorest schools—shutting them down, moving the students around to other schools, instituting privatization—instead of investing to support the teachers, make classes smaller, enrich curriculum and provide more counselors, along with trying to do something to alleviate poverty itself.  The Schott Foundation’s Loving Cities Index calls for a cross-sector effort to overturn today’s public policy that tests, punishes, and brands schools and teachers and children in our poorest communities.

What makes the Schott Foundation’s Loving Cities Index so important?

The Loving Cities Index Report redefines the problem of children left behind: “(T)wo facts remain true at a systems level: the public school system remains the primary institution of education for over 90% of students in America; and parental income remains the number one predictor of student outcomes—not type of public school, labor contract, or brand of assessment.  For far too long, efforts to improve educational outcomes have focused narrowly on the role of schools, classrooms and teachers, while ignoring the large and growing body of research that confirms what parents and families have long known—at the district level, health, housing, and parental employment opportunities are all intimately linked to high school and college attainment…  (A) Stanford University analysis of reading and math test scores from across the country found that, ‘Children in the school districts with the highest concentrations of poverty score an average of more than four grade levels below children in the richest districts.’”

The Loving Cities Index project encourages bridging services across schools and communities: “(T)he U.S. public school system continues to be our best hub to link families and students to the supports needed to thrive from birth… Providing students an opportunity to learn from birth is as much—if not more—the responsibility of mayors, county commissioners and city council members as it is superintendents, school boards, principals, teachers, and parents.”

The report continues: “A  new day requires that we no longer promote the false narrative that the American public education system is a failing proposition, which inaccurately places blame and policy focus on regulating principals, educators, students and parents… A new day requires that we take a more student-centered approach and commit to improving living environments as well as learning environments.”

The project begins with 10 cities judged on 24 indicators representing supports necessary for academic and economic success… “Ideally, we believe cities should achieve a minimum of 80% of possible points for indicators of healthy living and learning to be considered a model Loving City….”  Today, with 52%, Minneapolis and Long Beach are the highest scorers, with Buffalo at 50% a close second.

Schott’s Loving Cities Index rates cities by CARE indicators–health resources and physical environment (prenatal health, in-school support staff, clean air, healthy food, health insurance, parks, and mental health), and rates schools by COMMITMENT indicators—school policies and practices fostering the development of each student’s unique potential (preschool suspension alternatives, K-12 suspension alternatives, school-to-prison alternatives, K-12 expulsion alternatives, anti-bullying, and early childhood education).

In a preface to the report, the Rev. Dr. William Barber, leader of Repairers of the Breach, explains: “A large and growing body of research shows a clear connection between economic and racial inequality and opportunity gaps in areas like housing, health care and community involvement… The Loving Cities Index provides a frame to align policy-makers, philanthropy, and community members around a supports-based agenda, recognizing that the standards-based approach that has dominated education reform… for decades has failed to provide students an opportunity to learn.”

Harvard’s Daniel Koretz Indicts High Stakes Testing in “The Testing Charade”

Daniel Koretz’s new book, The Testing Charade: Pretending to Make Schools Better, is a scathing indictment of our society’s test-and-punish school regime, formalized in the 2002 No Child Left Behind Act and continuing in the most recent version of the federal education law, the Every Student Succeeds Act.  Koretz, the testing specialist, is not so critical of standardized testing itself as he is of the high stakes sanctions that Congress attached to the annual tests in No Child Left Behind—punishments that have driven massive pressure on educators that has ruined our public schools:

“Pressure to raise scores on achievement tests dominates American education today. It shapes what is taught and how it is taught.  It influences the problems students are given in math class (often questions from earlier tests), the materials they are given to read, the essays and other work they are required to produce, and often the manner in which teachers grade this work. It determines which educators are rewarded, punished, and even fired. In many cases it determines which students are promoted or graduate. This is the result of decades of ‘education reforms’ that progressively expanded the amount of externally imposed testing and ratcheted up the pressure to raise scores.” (p. 1)

Daniel Koretz’s biography at the Harvard Graduate School of Education describes him as an expert on educational assessment and testing policy, and the book describes in considerable detail just how high stakes punishments for schools and teachers have corrupted the results of the tests themselves, narrowed the curriculum, and degraded teaching.

But my deepest interest in the book is Koretz’s depiction of how the testing that was supposed to force teachers and schools to better serve poor children, raise their test scores and close achievement gaps has instead truncated opportunity for the very children it was supposed to help. How has test-and-punish narrowed the curriculum to basic reading and math in the poorest schools, and how has it forced teachers to focus on test-prep and coaching instead of enrichment?  How has test-and-punish forced the closing or charterizing of schools in poor neighborhoods? How has evaluating teachers by their students’ test scores resulted in firing principals and teachers in the poorest schools and exacerbated staff turnover?  And what about the children being held back in third grade due to a test score, even when they may be making real progress in reading, and the adolescents denied a high school diploma?

Under current federal law, students and schools are given credit for proficiency only when children reach benchmark proficiency scores. A fourth grader who advances during the school year from a first to a third grade reading level will still fail to achieve the fourth grade cut score. Neither the child nor the teacher will be given credit for the child’s improvement: “One aspect of the great inequity of the American educational system is that disadvantaged kids tend to be clustered in the same schools. The causes are complex, but the result is simple: some schools have far lower average scores…. Therefore, if one requires that all students must hit the proficient target by a certain date, these low-scoring schools will face far more demanding targets for gains than other schools do. This was not an accidental byproduct of the notion that ‘all children can learn to a high level.’ It was a deliberate and prominent part of many of the test-based accountability reforms…. Unfortunately… it seems that no one asked for evidence that these ambitious targets for gains were realistic. The specific targets were often an automatic consequence of where the Proficient standard was placed and the length of time schools were given to bring all students to that standard, which are both arbitrary.” (pp. 129-130)

Reformers decided that, if sufficiently pressured to raise test scores, teachers would be able to do so: “(T)hey acted as if… (schools alone could) largely eliminate variations in student achievement, ignoring the impact of factors that have nothing to do with the behavior of educators—for example, the behavior of parents, students’ health and nutrition, and many characteristics of the communities in which students grow up.” (pp. 123-124) Koretz explains at length and in detail the ways that teachers and principals whose jobs are threatened have resorted to raising scores—coaching for the test, drilling on materials likely to be covered, and in some cases where the pressure was greatest, cheating by erasing and correcting scores.

Koretz quotes Linda Darling-Hammond’s characterization of test-and-punish school accountability: “the kick the dog harder model of education reform.” And he explains: “If we are going to make real headway, we are going to have to confront the simple fact that many teachers will need substantial supports if they are going to markedly improve the performance of their students… And the range of services needed is broad. One can’t expect students’ performance in schools to be unaffected by inadequate nutrition, insufficient health care, home environments that have prepared them poorly for school, or violence on the way to school.” (p. 201)  He suggests first that we stop judging all students and schools by benchmark scores. We must “set goals based on students’ growth, not the level of their performance.” (p. 235)

In the Washington Post, Valerie Strauss interviews Koretz about his new book, and she publishes an excerpt.

While I have emphasized the sections in which Koretz shows test-and-punish hurting the schools that serve the poorest and most vulnerable children, Koretz is a testing expert, whose primary interest is how high stakes punishments attached to a regime of universal testing have corrupted the entire operation of public schools: “Reformers may take umbrage and say that they certainly didn’t demand that teachers cheat. They didn’t, although in fact many policy makers actively encouraged bad test prep that produced fraudulent gains. What they did demand was unrelenting and often very large gains that many teachers couldn’t produce through better instruction, and they left them with inadequate supports as they struggled to meet these often unrealistic targets. They gave many educators the choice I wrote about thirty years ago—fail, cut corners, or cheat—and many chose not to fail.” (p. 244)

Koretz joins a growing number of critics who indict test-and-punish school accountability. What is significant about this book is the thorough and relentless critique by a testing expert who carefully and sometimes technically dissects the evidence.

Clueless Betsy DeVos Blames School Teachers, Doesn’t Get that Test-and-Punish Is Core Problem

After our new U.S. Education Secretary Betsy DeVos visited a Washington, D.C. middle school last week, she insulted the teachers there.  She said the teachers were “in receive mode,” and continued: “‘They’re waiting to be told what they have to do, and that’s not going to bring success to an individual child,’ DeVos told a columnist for the conservative online publication Townhall. ‘You have to have teachers who are empowered to facilitate great teaching.’”

Let me point out that I have not noticed this “receive mode” among the teachers I know here in Ohio. Just last week Melissa Cropper, President of the Ohio Federation of Teachers (OFT), sent out a call for “activism now.”

Ohio requires far more testing than the annual test that was mandated by No Child Left Behind.  The new Every Student Succeeds Act  offers a way for states to develop their own accountability plans and a way to reduce—at least somewhat—over-reliance on test-and-punish.  Cropper is protesting the inaction of the Ohio Department of Education, which has just provided evidence that it will ignore the opportunity for states to have more latitude for shaping their plans for educational accountability rather than just have punitive sanctions imposed on them by the federal government. Patrick O’Donnell of the Plain Dealer reports: “Ohio’s proposed new state education plan under ESSA… avoids making any changes in state tests or even any recommendations, despite complaints of excessive testing of students dominating surveys and feedback sessions across the state.”  O’Donnell adds that Ohio’s draft plan isn’t final.

Cropper castigates the draft plan: “This plan is devoid of an overall vision for education and does nothing to move Ohio away from a testing culture and towards a culture that is more responsive to the needs of children.”  Why, wonders Cropper, does the Ohio Department of Education intend to submit its empty draft to the federal government on April 3, even though the state doesn’t really have to submit its final draft until September 18?  Is the state rushing this along to avoid public input and discussion?

Cropper urges school teachers and members of the public: “Continue your activism. Take the online ESSA survey now.  In each section, feel free to add whatever comments you might have about the topic, but make sure to include something that indicates that the plan does nothing to change our current testing culture and that the state needs to wait until September to submit so that it can be rewritten to reflect the vision Ohio wants for its students.”  She adds that the Ohio Department of Education will accept comments until March 6.

Bill Phillis, Executive Director of the Ohio Coalition for Equity & Adequacy of School Funding, amplifies Cropper’s plea for engagement by forwarding an e-mail notice from the Legislature’s Joint Education Oversight Committee, which is also holding hearings on Ohio’s ESSA draft plan: “The Joint Education Oversight Committee will be hearing testimony regarding Ohio’s State Plan for the Every Student Succeeds Act.  JEOC will hold two meetings on Thursday, March 2, 2017 at 2:30 PM and Thursday, March 9, 2017 at 1:30 PM in the Senate South Hearing Room. If you are interested in testifying please contact Haley Phillippi,  haley.phillippi@jeoc.ohio.gov or 614-466-9082 and indicate a date preference.” People wishing to testify should send their testimony to Phillippi 24 hours prior to the meeting.

The reason I was so amazed to hear Betsy DeVos criticize teachers as “in receive mode” is that, as part of a local education coalition in my own community, month after month, I listen to our teachers complain about the burden of testing and test prep on them and the students in their classes.  The teachers in our coalition were the people who demanded that we all read Alfie Kohn’s The Schools Our Children Deserve, a plea for a return to progressive education.

While Betsy DeVos insulted teachers last week as “in receive mode,” in my community and my state, teachers are dismayed and up in arms about what they are receiving. Here, in the words of Steve Nelson’s new book about progressive education, First Do No Harm, is the kind of pressure our teachers are irate about receiving from the U.S. Department of Education and the Ohio Department of Education: “Public schools all over America are judged by the standardized test results of their students. In many, perhaps most, communities the test results are published in local newspapers or available online. The continued existence of a school often depends on its standardized test scores… Neighborhood public schools are labeled ‘failing’ on the basis of test scores and closed, often to be replaced by a charter operation that boasts of higher test scores… What has occurred is a complex sorting mechanism.  The schools, particularly the most highly praised charter schools do several things to produce better scores…. (S)tudents are suspended and expelled at a much higher rate than at the ordinary public schools in their neighborhoods. Several studies show that charter schools enroll significantly fewer students with learning challenges or students whose first language is other than English.” (pp. 68-69)  All this pressures school administrators to force teachers to teach to the test at all costs.

Steve Nelson’s definition of progressive education is exactly what the teachers in my community’s elementary, middle and high schools are demanding: “While the distinctions between progressive education and conventional education are not always stark, it is reasonable to differentiate between ‘education and training,’ between ‘learning and being taught,’ and between ‘discovery and instruction.’ Conventional schools tend toward training and instruction, while progressive schools insist on learning and discovery. Perhaps the most powerful and misunderstood facet of progressive education is the notion of democracy. Progressive schools see themselves and their students as inextricably connected to the society in which they operate. The problems and fascinations of the world around them are the problems and fascinations they examine.” (p. 11) He adds: “Education should cultivate the capacity to recognize and create beauty. School is a place where empathy and compassion should be honored and developed. The flames of curiosity should be fanned, not smothered. Skepticism should be sharply honed.” (p. 48)

The teachers I know describe how they slip progressive projects and exploration in around the edges of the demands made on them to prepare children for tests.  They also manage to save enough energy to respond when Melissa Cropper of OFT asks them to speak up for a better Ohio ESSA Plan.  We must join them in speaking up.

We should also remind Betsy DeVos again and again that by reducing test-and-punish she could help everybody at school—superintendents, principals, teachers and children—escape education “in receive mode.”  If Betsy DeVos were honestly concerned that too many students are being trained and taught and instructed and that they are in schools that fail to emphasize deeper education—discovery, examination, problem solving, skepticism, curiosity and compassion, Betsy DeVos would be absolutely in agreement with the school teachers I know.

If Betsy DeVos really believed in progressive education, as Secretary of Education she could use her powerful position to support  teachers as they excite children’s curiosity and support their personal interests and development.

Wisdom Prevails, for Once, as Board Members Set State Math Test Cut Scores

This year Ohio threw out the PARCC Common Core test and adopted another test designed and administered by the American Institutes for Research (AIR).  But changing tests did not solve everybody’s problems. Here’s what happened, according to Patrick O’Donnell, the education reporter for the Plain Dealer:

“Preliminary test scores on Ohio’s new Geometry and Integrated Math II exams… show that the tests were such a mismatch with student ability that fewer than one out of every four students who took them met state benchmarks.  The state had predicted that 59 percent of high schoolers would score as ‘Proficient’ or above on the Geometry exam, but only 24 percent did.  Similarly, 56 percent were projected to score as ‘Proficient’ or above on the Integrated Math II exam and only 21 percent did.”

Ohio uses the tests as one of the factors that qualify a student for high school graduation: “Ohio has a few paths students can take to qualify for graduation, but the main pathway calls for them to earn ‘points’ toward graduation based on their ratings on state tests—one point if Limited, two for Basic, three for Proficient, four for Accelerated and five for Advanced.  High school students need 18 points to graduate.  With seven math, English and social studies tests required….”
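The arithmetic behind that points pathway can be sketched in a few lines of Python. This is only an illustration built from the numbers in the quoted passage (one to five points per rating, 18 points needed, seven tests); the names `POINTS_BY_RATING`, `graduation_points` and `meets_points_threshold` are hypothetical, and the sketch ignores any other requirements the truncated quotation does not spell out.

```python
# Points per rating, as listed in the quoted Plain Dealer description.
POINTS_BY_RATING = {
    "Limited": 1,
    "Basic": 2,
    "Proficient": 3,
    "Accelerated": 4,
    "Advanced": 5,
}

# The quoted passage says high school students need 18 points to graduate.
POINTS_NEEDED = 18


def graduation_points(ratings):
    """Add up the points earned for a list of test ratings (hypothetical helper)."""
    return sum(POINTS_BY_RATING[rating] for rating in ratings)


def meets_points_threshold(ratings):
    """Check only the 18-point total; other rules are assumed away here."""
    return graduation_points(ratings) >= POINTS_NEEDED


# Two made-up students, each with ratings on seven tests.
all_basic = ["Basic"] * 7
mixed = ["Proficient"] * 4 + ["Basic"] * 3

print(graduation_points(all_basic), meets_points_threshold(all_basic))
print(graduation_points(mixed), meets_points_threshold(mixed))
```

A student rated Basic on all seven tests earns 14 points and falls short, while four Proficient and three Basic ratings add up to exactly 18. Spread over seven tests, the threshold works out to an average of about 2.6 points per test, somewhere between Basic and Proficient, which is why the placement of the Proficient cut score matters so much for who graduates.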

Cut scores on standardized tests are, of course, a political calculation. There is nothing scientific about the measurement of proficiency. Tests can be made really hard or really easy.  And politicians, not psychometricians, set cut scores that determine who passes and who fails.  Too often the people setting the scores in the test-and-punish ethos that has dominated our society are anxious to “protect the quality of the diploma” and guarantee “college-and-career-ready.” There hasn’t been so much worry about the quality of the schooling experience itself.

In a follow-up report, Patrick O’Donnell quotes some Ohio politicians as they describe why they support setting tough cut scores.  Andrew Brenner, chair of the House Education Committee (and the man who calls public education “socialism”), says tough cut scores motivate students to work harder: “I think you’ll see a potential major improvement in the scores.” State school board member Todd Jones condemns adjusting the cut scores because it’s too permissive and he believes school ought to be tough:   “It’s the trophies-for-all movement… Students… can take classes again, get tutoring and take the tests again to earn better scores. They’re going to re-take it and re-take it and get remediation.”

The Ohio Department of Education showed better sense than usual: “Calling the lower results ‘outliers’ that need attention, the Ohio Department of Education… (asked) the state school board to adjust the cut scores so that 52 percent of students would be rated Proficient or above on the Geometry test and 35 percent Proficient or above on Integrated Math II.”

At this week’s hearing, A.J. Wagner, a member of the state board of education and a retired Dayton judge, spoke on behalf of the students whose scores would prevent their eventual high school graduation: “Wagner told the board that about 40 percent of high school students are not scoring well enough on required tests to graduate from high school. ‘All I hear is more rigor, more rigor, and they’ll try harder. But what if they don’t? I’m worried about those kids. I don’t know where they go.’”

Wagner acknowledges the reality described by Russell Rumberger in a publication of the American Psychological Association for students who fail to earn a high school diploma: “Dropouts face extremely bleak economic and social prospects. Compared to high school graduates, they are less likely to find a job and earn a living wage, and more likely to be poor and to suffer from a variety of adverse health outcomes. Moreover, they are more likely to rely on public assistance, engage in crime and generate other social costs borne by taxpayers.”

And there is another reality that nobody talks about much when they consider the meaning of the standardized testing.  Not all schools offer the same math and science classes.  The U.S. Department of Education Office for Civil Rights published data earlier this month about course offerings in the schools that serve primarily black and Latino students: “Black and Latino students are 38% of students in schools that offer Algebra II and 37% of students enrolled in Algebra II; they are 36% of students in schools that offer calculus and 21% of students enrolled in calculus….”  What the Department’s data doesn’t cover are the disparities in course offerings across wealthy suburbs, urban areas, and small and very tiny towns.  Not all students in public schools have access to the same opportunities to learn. Students on track to enroll in the most advanced high school math courses are likely to score better on standardized tests—even tests of basic geometry—than students in high schools that provide only minimal courses.

Mercifully, after considerable conversation, on Tuesday, the Ohio State Board of Education corrected the cut scores on the state math exams whose results this spring had come in alarmingly lower than predicted:  “In the end, the board voted 11 to 5 to adjust the scores that students need for the new high school geometry and Integrated Math II end-of-course exams…  But the board made no changes to expectations on other state tests, despite a passionate plea from board member A.J. Wagner of Dayton to lower expectations across the board to insure that students can graduate.”

Another Former Supporter of Test-Based Accountability Confesses His Error

After more than a decade of federal test-and-punish education policy, true believers in the schemes spun by corporate education reformers are reevaluating how it has all worked out since the passage of the No Child Left Behind Act.  One at a time they are changing their minds.  Most notable for recanting her original support is the education historian Diane Ravitch, who has written two books and conducts a daily blog to demonstrate all the ways she was mistaken.

Now Harold Kwalwasser, the former general counsel for the Los Angeles Unified School District, the man who was responsible for handling the dismissal of weak teachers, confesses his error: “One major problem was that we lacked objective measures of teacher effectiveness.  So when the 2001 No Child Left Behind Act brought the nation annual standardized testing for math and reading, I applauded… But 14 years on, I think that’s a mistake.  I believe our exam system is deeply flawed, especially when it comes to teacher evaluation.”

Kwalwasser makes his case simply and logically: “First, the results are too variable.  Teachers may one year be rated ‘highly effective’ while the next year they are merely ‘effective’ or worse, even though there are no observable changes in their teaching skills or strategies… Second, there is reason to doubt the relationship between test scores and an individual teacher’s competence… Third, we have the vagaries of student class assignment… None of the above even takes into consideration the segregation by race or class of school populations because of the continued (indeed increasing) segregation of housing patterns…. Fourth, the tests are too narrow in scope.  They largely focus on math and reading…. Finally, there is the little matter of the ‘cut score.’”  Because cut scores are usually set artificially high to motivate teachers and students alike to try harder, there is nothing objective or scientific about a cut score. “So teacher evaluations are at times as much a statement about politics as teaching ability.”

Kwalwasser is not convinced that standardized tests are necessary at all for the evaluation of teachers.  “Before standardized tests, some districts had great evaluation and professional development programs that weeded out low performers.  Others did not.  Adding test data can’t turn weak programs into effective ones…”

In an endorsement of grade span testing as an alternative to the grind of annual standardized tests, Kwalwasser concludes: “Civil rights advocates worry that without standardized tests, the troubling disparities in our public education system will sink back into the mists…. I concur… Testing at the end of fourth and eighth grade can meet that need.”

Kwalwasser’s critique is sensible and principled, without the rhetoric that clouds today’s usual conversations about education policy.  His standard is the impact of public policy on the people in the schools—the students and their teachers: “Holding teachers and schools accountable is important, but the means should be accurate and fair.  The current standardized test program doesn’t pass muster.”  I urge you to read Kwalwasser’s piece carefully.