NAEP Scores Stagnate; Test-and-Punish Flops; But Duncan’s New Plan Fails to Change Course

The biennial NAEP scores were released yesterday.  Diane Ravitch knows a lot about the National Assessment of Educational Progress, the NAEP.  Appointed by President Bill Clinton, she served on the National Assessment Governing Board for seven years. She describes what this test is: “NAEP is an audit test. It is given every other year to samples of students in every state and in about 20 urban districts. No one can prepare for it, and no one gets a grade. NAEP measures the rise or fall of average scores for states in fourth grade and eighth grade in reading and math and reports them by race, gender, disability status, English language ability, economic status, and a variety of other measures. The 2015 NAEP scores showed no gains nationally in either grade and in either subject… The best single word to describe NAEP 2015 is stagnation.”

Ravitch describes what she believes is the meaning of this year’s scores, and I agree with her: “For nearly 15 years, Presidents Bush and Obama and the Congress have bet billions of dollars—both federal and state—on a strategy of testing, accountability, and choice.  They believed that if every student was tested in reading and mathematics every year from grades 3 to 8, test scores would go up and up. In those schools where test scores did not go up, the principals and teachers would be fired and replaced. Where scores didn’t go up for five years in a row, the schools would be closed. Thousands of educators were fired, and thousands of public schools were closed, based on the theory that sticks and carrots, rewards and punishments, would improve education.”  But it hasn’t worked.

Carol Burris, the retired NY high school principal and now executive director of the Network for Public Education, interprets the 2015 NAEP scores: “NAEP is a truth teller. There is no NAEP test prep industry, or high-stakes consequence that promotes teaching to the test. NAEP is what it was intended to be—a national report card by which we can gauge our national progress in educating our youth.  During the 1970s and ’80s, at the height of school desegregation efforts, the gap in scores between our nation’s white and black students dramatically narrowed. You could see the effects of good, national policy reflected in NAEP gains. The gaps have remained, however, and this year, the ever so slight narrowing of gaps between white and black students is due to drops in the scores of white students—hardly a civil rights victory.”

Last weekend, U.S. Secretary of Education Arne Duncan announced what some people have seen as a significant pivot in education policy—a turning away from reliance on so much testing and a limit of 2 percent on the amount of time students are spending taking standardized tests at school. Here is what the U.S. Department of Education’s press release says about the anticipated change: “In too many schools, there is unnecessary testing and not enough clarity of purpose applied to the task of assessing students, consuming too much instructional time and creating undue stress for educators and students. The Administration bears some of the responsibility for this and we are committed to being part of the solution.”

The Department’s new Testing Action Plan describes seven principles that will govern a new testing policy whose formal guidance document will be released in 2016: Tests should be “worth taking,” “high quality,” “time-limited,” “fair—and supportive of fairness,” “fully transparent to students and parents,” “just one of multiple measures,” and “tied to improved learning.”  Duncan proposes to make funds available to help states and school districts “develop less-burdensome assessments,” to review their tests and make them more innovative, to pay for experts to guide states on how to reduce time on testing, and to provide technical assistance and even technical assistance centers and labs to “provide targeted assessment audit support.”

Duncan also indicates that the Department of Education will be more flexible in its demands relating to testing and its uses: “The Administration will invite states that wish to request waivers of federal rules that stand in the way of innovative approaches to testing to work with the Department to promote high-quality, comparable, statewide measures.”  Duncan continues: “The Department will work with external assessment experts to implement a more transparent assessment peer review process of state assessments… To avoid double-testing of students, the Department will offer states flexibility from No Child Left Behind’s requirement that all 8th graders be tested on the same, statewide 8th grade math and reading tests, when such students are taking advanced high-school level coursework in 8th grade.”  He adds: “The Administration has adjusted its policies to provide greater flexibility to states in determining how much weight to ascribe to statewide standardized test results in educator evaluation systems required under the Administration’s ESEA flexibility policy.”

The Department of Education’s proposal to adjust its testing policies follows a major report released last week by the Council of the Great City Schools that deplores the amount of standardized testing being required by the federal government and states.  The report explains that, “401 unique tests were administered across subjects in the 66 Great City School systems.  Students… were required to take an average of 112.3 tests between pre-K and grade 12… The average student in these districts will typically take about eight standardized tests per year… Some of these tests are administered to fulfill federal requirements under No Child Left Behind, NCLB waivers, or Race to the Top, while many others originate at the state and local levels… Testing pursuant to NCLB in grade three through eight and once in high school in reading and mathematics is universal across all cities.  Science testing is also universal according to the grade bands specified in NCLB.  Testing in grades PK-2 is less prevalent than in other grades, but survey results indicate that testing in these grades is common as well… Urban school districts have more tests designed for diagnostic purposes than any other use, while having the fewest tests in place for purposes of international comparisons… Some 39 percent of districts reported having to wait between two and four months before final state test results were available at the school level, thereby minimizing their utility for instructional purposes… There is some redundancy in the exams districts give… The findings suggest that some tests are not well aligned to each other, are not specifically aligned with college-or career-ready standards, and often do not assess student mastery of any specific content.”

I don’t believe the Department’s change of plans will significantly address the challenges described so thoroughly by the Council of the Great City Schools.  America’s educational philosophy, formalized in 2002 in the federal No Child Left Behind Act (NCLB) and perpetuated for more than a dozen years now in federal policies like Race to the Top and the NCLB waivers, is a two-pronged strategy: first test and then punish the districts and schools that cannot quickly raise scores on the tests.  It is the PUNISH strand of this policy that has caused the explosion of testing.  NCLB’s mandate of an annual test was merely the beginning.  Fear is the motivator in a system driven by sanctions and punishments.  In the school districts where scores are lowest, school officials, driven by fear, have added practice tests and benchmark tests and more practice tests and hours of test prep: anything that might raise test scores. If students are tested again and again, says the logic of a test-and-punish plan, maybe students will get better at taking tests and their scores will rise.

Here are the problems I see in the new testing ideas announced last weekend by the Department of Education.

  • In the first place, as Diane Ravitch points out so clearly, limiting testing to 2 percent of the school year is not really much of a change: “Actually that wasn’t a true reduction, because 2% translates into between 18-24 hours of testing, which is a staggering amount of annual testing for children in grades 3-8 and not different from the status quo in most states.”
  • Secretary Duncan’s new plan does not dislodge testing from its place as the centerpiece of the current proposals in Congress to reauthorize the Elementary and Secondary Education Act (the law we currently call NCLB). Duncan proposes neither to move from annual testing to grade-span testing (once each in elementary, middle, and high school) nor to eliminate the high stakes for school districts and schools (and their teachers) that are unable to raise students’ scores quickly.
  • High stakes will continue to frighten school district officials into narrowly focusing on the tested subjects of reading and math and filling too much time with test-prep lessons.
  • Somehow, says Duncan, there is to be more flexibility about basing teachers’ evaluations on students’ scores, particularly for teachers in subjects that are not tested. This presumably means we won’t be adding more tests in previously non-tested subjects just to be able to judge teachers by the tests.
  • It seems to me that—quite typically for this Administration—there will be more money made available for consultants to evaluate and re-calibrate the testing system.

Duncan’s concept is to retain a test-and-punish system.  Writing for Education Week’s federal policy blog, Andrew Ujifusa quotes Secretary Duncan’s own description of the new plans at a press conference: “The goal is to have good assessment that drives instruction, and if you reduce testing to 1 percent, and it isn’t relevant… it is not guiding instruction, that is a loss, that’s a failure, not a win.” Ujifusa paraphrases Duncan’s next comment: “However, if students spend slightly more than 2 percent of instructional time on testing and the assessments are helping teachers, parents understand them, and students are part of the solution, that’s a good outcome.”  This sounds to me as though we are going to continue with a test-based system.

Considering the stagnant NAEP test scores released yesterday, Kevin Welner, Director of the National Education Policy Center, comments on Duncan’s recently announced reassessment of the role of standardized testing: “It’s long past time to recognize that any benefits of test-based accountability policies are at best very small, and any meager benefits teased out are more than counterbalanced by negative unintended consequences.” “This (the focus on testing) is the tragedy. It has distracted policymakers’ attention away from the extensive research showing that, in a very meaningful way, achievement is caused by opportunities to learn. It has diverted them from the truth that the achievement gap is caused by the opportunity gap. Those advocating for today’s policies have pushed policymakers to disregard the reality that the opportunity gap arises more from out-of-school factors than inside-of-school factors… So schools with low test scores were labeled ‘failing’ and were shut down or reconstituted or turned over to private operators of charter schools… Teachers whose students’ test scores didn’t meet targets were publicly shamed or denied pay or even dismissed.  Our entire public schooling structure became intensively focused on increasing test scores. But once we admit that those test scores are driven overwhelmingly by students’ poverty- and racism-related experiences outside of school, then ‘failing’ schools are little more than schools enrolling the children in the communities that we as a society have failed.”