I Only Know One Truth-It is Time for Bossy McBossypants Testing to End

(Guest Post by Matthew Ladner)

Last spring I was on my bike and came across this in front of a local middle school. I found it striking enough to take a picture with my phone:

In case you are squinting at your iPhone, the sign says “AZ Merit Testing 4/17-5/3.” Now mind you that the test that these students take to determine whether or not they go to college, and if so what sort of college, takes 4 hours. Comprehensive exams for a Ph.D. took me three days. Somehow in the awesome logic of 2017 it came to pass that it would make some sort of “sense” to disrupt schools for two weeks to give…AZMerit.

I think we all know what tends to happen to the school year starting (in this case) 5/4.

Weeks later I had the opportunity to observe a number of focus groups held on parental choice policy. The groups were from different parts of the country, and included parents, teachers and opinion leaders. Despite the fact that the topic of the convening was never testing, everyone made their feelings on the subject clear during conversations. All groups everywhere deeply dislike the current practice of standardized testing.

I can’t emphasize the next point strongly enough: I never once heard anyone use the phrase “Common Core” or burst into a fit of conspiracy mongering. Rather what I saw repeatedly was that people feel that schooling has become overly fixated on test preparation. People have a rather strongly held belief that schooling is supposed to be more than test prep. Something has gone terribly wrong with education in their view, and they want it to stop. Across the groups I saw, the consensus seemed to be that we should drive a stake through the heart of the current system, fill the mouth with holy wafers, and then burn the sarcophagus to fine ash.

I have seen remarkably little evidence that today’s heavy-handed, standards-based testing system is of much utility. There is some suggestive evidence that states that had been doing nothing on the testing front before NCLB got a modest bump in results when they started testing. They may, however, have received a similar bump from a system with a much lighter footprint. Moreover, no less than Hanushek and Loveless have concluded that the heavy-handed Common Core project resulted in approximately nothing in the way of improved student learning. Given that we live in a democracy, a lighter footprint system seems like a fine idea.

So here is mine:

Preserving campus level academic transparency should be the central goal of testing. The Demos would apparently be happy to sacrifice it in return for slaying the testing vampire, but it would be a terrible loss in my view. States can adopt whatever standards they want (I suggest the old Massachusetts standards) but give their students a three-hour national norm-referenced (NNR) exam on the second to last day of school. The last day rather than last month of school can now be the write-off. Do a good job teaching the MA standards, and your students will do well/show progress on the NNR test.

Some will want to have their state officials grade or otherwise label schools based on the results. Have at it, but it is worth noting that the de facto accountability system in this country has become the Greatschools rating system, given that is where the eyeball traffic resides. State ratings have become little more than an obsession internal to the system. Some will want to continue on the troubled path of trying to move the number of teachers fired for low performance from 1% to 1.5%. My view is that this is an unworkable path to hold schools accountable, but if some state or locality wants to keep it up, feel free.

I know some of you continue to feel motivated by the idea that standards are going to lead us to profound improvement and narrower achievement gaps. Decades into the project it is time to ask: where’s the beef? If you are willing to impose a deeply unpopular system of testing upon American families, I must ask why. The burden of proof lies with you. If you (like me) would like to preserve campus level transparency, I ask: what is your plan? My plan is to adopt a system that is less intrusive and prescriptive and hold on for dear life to campus level data. Now tell me your plan. If your plan is to hold on for dear life to a system that the public abhors, I want to suggest that you need a new plan.

In my view, voting with your feet represents the most robust form of accountability by a very wide margin. I would like to have those voting decisions informed by test scores, and a great many other things including parent reviews (score another touchdown for Greatschools). Watching the focus group discussions made me realize that the United States House’s decision to enact a deeply misguided federal opt-out was not a fluke, but rather fit with the democratic sentiments of their constituents.

Opt-outs lead to nudge-outs, which lead to completely unreliable and thus worthless data. They will be passing at the state level soon unless transparency supporters pull their heads out of the sand. As Cornwallis wrote to Clinton before the Battle of Yorktown, “What is our plan? If we don’t have one, what are we doing here?”

Perhaps I’ve got this all wrong. If so, the comment section awaits.

This entry was posted on Tuesday, September 5th, 2017 at 8:05 am and is filed under NCLB, school accountability.

21 Responses to I Only Know One Truth-It is Time for Bossy McBossypants Testing to End

Ze'ev Wurman says:

September 5, 2017 at 11:39 am

Amen.

Regarding that early “bump,” it seems unclear that even it was caused by testing–most testing regimes came together with standards, which didn’t exist prior to that point, so it is difficult to disentangle the two and point to testing as the actual cause.

But testing became a monster that drives everything. Having nationally normed tests that are divorced from the local curriculum is helpful in eliminating teaching to the test on one hand, and encouraging a semblance of national uniformity on the other.

But the current testing regime is only preferred by educational researchers. Should we really have an educational system geared for them rather than for the families it’s supposed to serve?

matthewladner says:

September 5, 2017 at 3:52 pm

Thanks Ze’ev- I agree that it is difficult to disentangle that, and there were other policy changes going on at the time as well. If we decide to attribute the academic gains to standards and testing, I’m willing to compare DC’s growth on NAEP during this period to those previous non-testing states, as they used (gasp!) Stanford 9 as their “state test.”

Lynn Woodworth says:

September 6, 2017 at 10:37 am

Your assumption that no learning occurs after testing is based on what? That comes off very much with an “as everyone knows….” vibe.

Testing is both useful and informative in that it forces schools to look at the performance of sub-populations rather than anecdotally citing the performance of the majority as proof of their quality. Sites like Greatschools have a great danger of that fallacy.

I also take issue with the expression “teaching to the test.” Isn’t that what schools are supposed to do? I am not talking about teaching test-taking skills rather than content. That is just bad teaching. Rather, teaching the information and skills which have been deemed by the state to be critical is in fact teaching to the test. That is the point of having state standards.

As to the quality of tests, there is no reason any state should still be using long form, paper and pencil proficiency tests. The technology exists for computer adaptive exams which can quickly (a few hours as opposed to a few days) and more accurately assess a student’s knowledge of the expected content. Further, these computer adaptive exams can provide rapid feedback to teachers. Under that setting, giving exams in late March or early April would allow teachers time to provide remediation to students who are lacking in achieving the expected level of knowledge and skills.

Rather than end testing over its flaws, let’s fix the flaws.

- Zeev Wurman says:
  
  September 6, 2017 at 11:32 am
  
  We have tried to fix the flaws for many years, and failed so far.
  
  All the benefits of testing in terms of disaggregation, etc., are also available with a short nationally normed test. The only thing that will not be possible is holding teachers directly accountable for testing results, if the test doesn’t closely follow the local classroom curriculum. But ESSA already gave up on that anyway. In fact, the annual testing demand in ESSA has no rhyme or reason anymore, because ESSA gave up on hopes of tying teacher evaluation to testing results and gave up on the “every student will be proficient” mantra. There is no logical reason anymore to even have an annual census test; a sampling test would do as well for subgroup achievement, just as NAEP does.
  
  As to teaching to the test and its effects, I also thought long ago that if the test is aligned with curricular goals, there is nothing wrong with “teaching to the test.” Today I have changed my mind. I didn’t realize that for many teachers and principals teaching to the test doesn’t mean teach the curriculum that will be tested, but rather it means teach (and practice ad nauseam) only how to answer test questions. Essentially a glorified Kaplan training in the classroom. Particularly to weak kids. Sad but true.
  
  Finally, regarding computer adaptive testing. Yes, the original promise was that CAT will significantly shorten the tests and produce results in almost real time. In reality, we see that the Smarter Balanced test is anywhere from 1.5 to 3 times LONGER than the old tests (depending on the state) and the results still don’t make it before late August or early September. Perhaps because of the idiotic insistence of the new tests on “authentic” items, perhaps because of the addition of unwieldy and meaningless “performance” items, perhaps because of general incompetence in the field. The only thing that those computer-based tests have achieved so far is to put financial strain on school districts to rapidly buy computer equipment. I am sure Apple and Microsoft are happy.
  
- matthewladner says:
  
  September 6, 2017 at 11:55 am
  
  It does have an “as everyone knows” vibe but only because I have heard it repeated by educators dozens of times. That is not to say that it happens everywhere, but I don’t have any reason to doubt that it goes on.
  
  You could look at subgroups with an NNR.
  
  I can tell you that if you think that the purpose of education is to teach to a test that you’ve decisively lost the public. In the focus groups I saw people would be willing to ditch standardized testing without much of a second thought. I think this would be a mistake, but again it was no accident that the US House passed a deeply inappropriate federal opt out provision by a wide bipartisan majority.
  
  Here in AZ we have the worst of both worlds- the test takes weeks to administer and then the parents get a report in the summer. The theoretical benefits to computer adaptive testing remain just that- theoretical.
  
  - Greg Forster says:
    
    September 7, 2017 at 6:18 am
    
    “I can tell you that if you think that the purpose of education is to teach to a test that you’ve decisively lost the public.”
    
    But Matt, why should the government care what the public thinks?
  - matthewladner says:
    
    September 7, 2017 at 7:32 am
    
    ‘Cause democracy!
  - Greg Forster says:
    
    September 7, 2017 at 10:51 am
    
    No, no, you’ve got it all wrong. Amy Gutmann has specifically assured me that education is only democratic if professional educators tell voters and parents how to educate their children.
- Peter Cunningham says:
  
  October 3, 2017 at 4:32 pm
  
  I am with Lynn Woodworth. (No surprise from an Obama alum.)
  
  We should compare places that administered high standards and test-based accountability faithfully–like Massachusetts–to states that didn’t– which is most other places.
  
  Policy implemented badly is not a valid argument against the policy. Do bad charters argue for abandoning choice? Does the recent data on college students’ ignorance of the first amendment argue against teaching civics? Of course not.
  
  You say you have seen little evidence of the “utility” of testing, but hasn’t it forced a needed conversation about achievement gaps?
  
  You cite Hanushek and Loveless saying Common Core is a failure. But most states are still not fully implementing the standards — with better curriculum, teacher training, aligned assessments, etc. Those doing it well are making progress. Nobody promised the moon. That’s a straw man.
  
  You encourage states to adopt “whatever standards they want.” Isn’t that already the law? Wasn’t it always the law? It’s not a good law in my opinion but it’s still the law. The fact is, states play with the standards and cut scores and grad requirements and end up with Tennessee “standards.” Yee-hi.
  
  I’m a big fan of Great Schools. I serve on the board. But they don’t have any teeth to force states and districts to improve. You say voting with your feet is the best form of accountability. I agree it’s one good form of accountability but it’s not enough.
  
  Your headline and your opening suggest that you want to end testing but then you really don’t–you just want to end the stakes–but what are the stakes? How many schools close because of testing? How many teachers get fired because of testing? Pretty close to zero in both cases.
  
  You say the burden of proof lies with us. OK, pick a metric. HS grad rates, college enrollment/grad rates all rising. (I know, I know those can be gamed, but name a metric that can’t be gamed.) Do higher test scores count? Given that the public school population is poorer today than yesterday, holding flat is progress.
  
  Lastly, surveys show parents support testing. The “unpopularity” you reference is driven by locally-required over-testing (never the intention) and a campaign from unions and anti-reformers to deny us the data on performance in order to drive change.
  
  If you take away the accountability purpose of testing and keep it only for transparency, the anti-testing crowd will go the next step and say — “well gee, if there are no stakes, then what’s the point? Let’s just drop it completely.”
  
  Which puts us right back where we began before testing and accountability: in the dark.
  
  - rpondiscio says:
    
    October 3, 2017 at 5:13 pm
    
    <<< Policy implemented badly is not a valid argument against the policy.
    
    Not after a day, a week, or a year. Even a couple of years. After *many* years that's exactly what it is: a valid argument against the policy.
    
    American education may be "unaccountable" but the logic of test-driven accountability assumes that there is broad general competence that needs to be unlocked, directed and incentivized. I see little evidence for this belief.
  - matthewladner says:
    
    January 16, 2018 at 2:26 pm
    
    Peter-
    
    I missed your comment when you left it, but I’ll offer a belated response. First off, we have no idea what policy, if any, caused MA’s already high NAEP scores to edge up. The 1993 reform package included multiple reforms, and while it may be the case that only a criterion based test would have contributed to the gains, that may not be the case at all. A reform strategy whose proponents have only one state to point to, and not much in the way of suggestive evidence that it actually made a large difference in that state, is not in the best shape.
    
    I want to preserve campus level academic transparency, but question the value added to current practices vis a vis realistic alternatives.
    
    As for opinion polls, different polls will tell you different things based on everything from question wording to framing provided in previous questions. The focus groups I watched were not about testing, and I watched the moderator try to talk the groups into preserving some kind of campus level data. One entire panel sat staring at the table as the moderator asked them about whether they wanted campus level data, with several people shaking their heads “no.”
    
    The hour is later than the social justice bubble thinks imo.
Greg Forster says:

September 7, 2017 at 6:23 am

Improving Greatschools would do more than all standards and testing systems combined. It’s great to have it but there are a lot of ways it could be improved. Ironically, the biggest problem is that test score levels play too big a role in its ratings.

See also.

seanrickert says:

September 10, 2017 at 9:10 am

There is an underlying question that you don’t address, but which must be part of any discussion about moving past standardized testing. For what purpose were standards and thereby standardized testing established? And, do we still care to achieve the goals that standardized tests were developed to meet?

There are many schools of thought explaining why standardized testing exists. They range from the Binet IQ test based theory that we test students to sort them to the accountability based notion that we owe parents valid information about the quality of their child’s school and this can be achieved through standardized testing. Given the current practices employed in testing student performance in Arizona, these theories are both complete hokum. If they comprised the whole purpose for standardized testing, your assessment and recommendations would be correct. What is lost is the deeper purpose behind testing and the potential for improvement that qualitative data about student performance offers to teachers, schools, districts and entire education systems.

First, there is a deeper purpose behind standards and standardized testing. At a point in the distant past there was no such thing as a standardized test or state standards. Teachers in the classroom did what they felt was best, principals determined if what they taught was effective, school boards assessed the quality of their schools and parents ran for the school board. There’s a lot that can be said for this system, but anybody who is advocating for its return needs to remember that it was not perfect. It created tremendous disparities of achievement. There were disparities between what was expected of wealthy children and poor children. There were disparities between what was expected of metro children versus rural children. Most importantly for some, there were tremendous disparities between what was expected of different racial and ethnic populations. Brown v. Board of Education inserted the notion that if you fail to equitably educate all children you are depriving some of access to the opportunities promised by our nation’s founding documents. This led to busing. It also led to standards. When you have standards there is a need for testing to determine if those standards are being mastered. In Arizona’s recent Lopez court case much of the information used to show that English Language Learners in Arizona were not receiving an adequate education came from standardized testing. Without the Arizona Instrument to Measure Standards (AIMS) it would have been impossible to show that one group of students was consistently underperforming and to argue that race based inequality of opportunity was the result of institutional practices. Standardized tests exist first and foremost as a guidepost to enable us to identify gross inequities within our education system. 
If we are confident that in our current system there are no inequities, or if we have decided that we no longer care if inequities exist, then we should eradicate the testing system for the anachronism that it has become. If there is a need to ensure we are not relegating some children to lives of poverty because they receive a subpar education we need a mechanism for monitoring the extent to which the child is being taught.

Second, there are ways to utilize test score data that go beyond school accountability. In the era of constant data it only makes sense that we are trying to utilize data to better understand student achievement. The final output of an accountability system based heavily on AzMERIT scores will not accomplish this goal. There needs to be a discussion about what could be achieved if we had a tool that was useful for teachers and schools. The current low cost option does a poor job. Schools are forced to supplement the weak data they receive from AzMERIT with DIBELS data on reading achievement, and Galileo or NWEA testing data on student progress in basic numeracy and literacy, if they want to have a complete picture of the student’s ability and needs. In the pre-AzMERIT world the Arizona Department of Education proposed that we could have a single assessment system that would include integrated quarterly tests providing student-level data on strengths and weaknesses. This was going to be achieved through the PARCC examination system, and a portion of the cost of participating would be borne by the districts. In the end that goal was abandoned in favor of the lower cost current system. If we were committed to devoting the resources needed to develop an assessment system that provided information that truly informed instruction, there might be additional reasons for maintaining a system of statewide assessments. Within the current system AzMERIT sits as a cumbersome, ineffective tool added to the myriad of other standardized tests that are administered. The other tests provide data that is useful.

I’ll probably offend somebody by comparing students to swine, but somebody much wiser than I once pointed out that you can’t put weight on a pig by weighing it more often. You put weight on by feeding a quality diet. The quality of the education we provide to our students has to be top priority. How do we give teachers the tools that they need to best meet the needs of their students? How do we adequately support them to ensure that the highly effective ones stick around and those who are still developing are either on a path to becoming highly effective or are moving out of the profession? How do we structure the time students spend at school to meet their myriad of needs including but not limited to academic, social, emotional, spiritual and physical growth? These are the questions that should be driving education decisions. We should not be focused on, “How can we improve test scores?”, but we know that too much of the time that is what happens. This is the issue that must be addressed, and I’m not sure throwing the baby (standardized testing) out with the bathwater (failures to address the important questions) is an effective strategy.

- Matthew Ladner says:
  
  September 19, 2017 at 3:27 pm
  
  Sean- thank you for your thoughtful reply. Please note that I am not calling for the abolition of testing but rather for a realistic path to preserving campus level academic transparency. I don’t see what the value add is for criterion based tests, and more to the point, the public seems deeply hostile to them. Someone could gather signatures to put an opt-out on the ballot and I’m guessing it would sail through regardless of what I, you or anyone else had to say about it.
  
  So- what is our plan? The status-quo is not likely an option.
  
pdexiii says:

September 10, 2017 at 7:15 pm

Black and Brown children under-perform on these tests because a) They aren’t taught the content thoroughly to engage the tests, and b) because so many teachers abhor testing they don’t emphasize it in their pedagogy (‘projects’, homework, and ‘extra credit’ take more precedence) thus students don’t get the practice to be comfortable in a testing environment. Given that practically every profession or occupation requires some type of test for you to demonstrate proficiency, this approach by so many urban educators borders on educational malfeasance.

As Zeev Wurman said, it’s very frustrating to be forced into testing with over a month left in school, and we still don’t get the results until late August in spite of the electronic nature of the tests. For a state like CA I sense the layers of bureaucracy that have to process this data require such early testing yet late reporting, which is still unacceptable.

If I instruct the content and the students practice it faithfully, they’ll perform well on tests. As I’ve said before, though, I’ve never seen a standardized test tell me something about my students I didn’t already know.

rpondiscio says:

October 2, 2017 at 6:25 pm

It’s been 30 years. I suspect that if test-driven accountability was going to lead to a broad improvement in educational outcomes it would have done so by now. If we want to save the data we get from testing, it’s time to give ground–a lot of ground–before we lose the data too.

- Ze'ev Wurman says:
  
  October 2, 2017 at 7:47 pm
  
  Hey, Robert! We (finally!) agree! 🙂
  
In Testing Backlash, Virginia Drops History Tests Despite Horrific Ignorance - The Right Side of News says:

October 4, 2017 at 5:29 am

[…] Dr. Matthew Ladner recently discussed observing focus groups questioned about various education policies: “All groups everywhere deeply […]

Thanks To Testing Backlash, Virginia Drops U.S. History Tests Despite Horrific Public Ignorance – Liberty REDUX says:

October 4, 2017 at 6:10 am

[…] Dr. Matthew Ladner recently discussed observing focus groups questioned about various education policies: “All groups everywhere […]

American Families Rebel Against K-12 Standardization with their Pocketbooks - EducationNews says:

September 28, 2018 at 5:45 pm

[…] families have deep misgivings about the conduct of public schooling, especially the modern practice of standardized testing. We should aspire to a schooling system that is a good deal more than providing test-prep custodial […]

Yesterday ‘accountability’ might have stood against the world says:

August 13, 2024 at 1:47 pm

[…] likewise has powerful opponents in the form of unionized employee interests but also suffers from broad lack of public enthusiasm. The effort to nudge schools into better learning was well-intended and had at one point broad […]
