(Guest Post by Matthew Ladner)
The 2017 NAEP will be released next week, and a few notes seem in order. Over time, the term “mis-NAEPery” has slowly morphed into a catchall phrase meaning “I don’t like your conclusions.” Mis-NAEPery, however, has an actual meaning, or at least it should: something along the lines of “confidently attributing NAEP trends to a particular policy.”
Arne Duncan, for instance, recently took to the pages of the Washington Post to credit all positive NAEP trends since 1990 to his own tribe of reformers (center left):
Lately, a lot of people in Washington are saying that education reform hasn’t worked very well. Don’t believe it.
Since 1971, fourth-grade reading and math scores are up 13 points and 25 points, respectively. Eighth-grade reading and math scores are up eight points and 19 points, respectively. Every 10 points equates to about a year of learning, and much of the gains have been driven by students of color.
Duncan then proceeds to dismiss the possibility that student demographics had anything to do with this improvement, noting how the American student body has changed: “It should be noted that the student population is relatively poorer and considerably more diverse than in 1971.” This is, however, a contention deserving dispute, given that the inflation-adjusted income (in constant 2011 dollars) of the poorest fifth of Americans almost doubled between 1964 and 2011 once various transfers (food stamps, EITC, etc.) are taken into account. Any number of other things, both policy and non-policy related, could also explain the positive trend, but never mind any of that: Mr. Duncan lays claim to all that is positive.
Duncan was not finished yet, however, as he was at pains to triangulate himself away from those nasty people who support more choice than just charter schools:
Some have taken the original idea of school choice — as laboratories of innovation that would help all schools improve — and used it to defund education, weaken unions and allow public dollars to fund private schools without accountability.
Well, that sounds a bit like how a committed leftist would (unfairly) describe my pleasant patch of cactus. Arizona NAEP scores, could you please stand to acknowledge the cheers of the audience:
So the big problem in that chart is the blue columns. These charts stretch from the advent of the Obama years to the (until Tuesday) most recently available data. We won’t be getting new science data this year, so ignore the last two blue columns on the right. What we are looking at is changes of 1 point in 4th grade math, -1 point in 8th grade math, 1 point in 4th grade reading, and 2 points in 8th grade reading. There’s only one state that made statistically significant academic gains on all six NAEP tests during the Obama era, and it just so happens to be one of the ones adopting the policies uncharitably characterized by Duncan’s effort at triangulation.
There were some very large initiatives during these years (Common Core standards, teacher evaluation, etc.), and we can’t be sure why the national numbers have been so flat, but let’s just say that a net gain of three scale points across four 500-point tests fails to make much of an impression. Supporters of the Common Core project, for instance, performed a bit of a Jedi mind trick around the 2015 NAEP by noting that scores were also meh in states that chose not to adopt, and that 2015 was still early. Fair enough on the early bit, but the promise behind the enormous investment of political capital in the project was not that adopting states would be equally meh, but rather that things would get better.

Where’s the BETTER?!?
Duncan’s mis-NAEPery, however, is of the garden variety; there has been far worse. Massachusetts, for instance, instituted a multi-faceted suite of policy reforms in 1993, and its NAEP scores increased from a bit better than nearby New Hampshire’s to two bits better than New Hampshire’s and tops in the country. So far as I can tell, there was approximately zero effort to establish micro-level evidence on any of the multiple reform efforts, or to disentangle, to the extent the policies were having a positive impact, which policies were doing what. That would be silly: everyone knows that standards and testing propelled MA to the top NAEP scores, and once everyone else does it we will surge towards education Nirvana and Canadian PISA scores. Well, I refer the honourable gentleman to the tiny blue columns in the chart I referenced some moments ago.
This is not to say that I am confident testing and standards had nothing to do with MA’s high NAEP scores. I’m inclined to think they probably did, but some actual evidence would be nice before imposing this strategy on everyone. In Campbell and Stanley terms, “Great Caesar’s Ghost! Look at those Massachusetts NAEP scores!” lacks evidence of both internal and external validity. In other words, we don’t know what caused MA’s NAEP scores, nor do we know who, if anyone else, might be able to pull it off, assuming policy had something to do with it.
So beware of mis-NAEPery, my son: the jaws that bite, the claws that catch! But also beware of NAEP nihilism. Taking off my social science cap, I will note that NAEP is an enormous and highly respected project, and it is done expressly for the purpose of making comparisons. Yes, we should exercise a high level of caution in so doing, and should check any preliminary conclusions against other sources of available evidence. The world is a complicated place with an almost infinite number of factors pushing achievement up or down at any point. There is a great deal of noise, and finding the signal is difficult. NAEP alone cannot establish a signal.
The fact that the premature conclusions drawn from the Massachusetts experience lacked evidence of internal and external validity did not mean that those conclusions were wrong, but it did make them dangerous. Alas, the world does not operate as a random-assignment study. Policymakers must make decisions based upon the evidence at hand, NAEP and (hopefully) better than NAEP. The figure at the top of this post makes use of NAEP, and there is a whole lot of top-map green (early goodness) turning into bottom-map purple (later badness) going on. This is a bad look, assuming part of what you want out of your support of K-12 education is kids learning math and reading in elementary and middle school. Let’s be careful, but let’s also see what happens next.