Anna Egalite, Jonathan Mills, and I have a new study in the journal Improving Schools, in which we administer multiple measures of “non-cognitive” skills to the same sample of students to see if we get consistent results. We didn’t.
How students performed on a self-reported grit scale was uncorrelated with behavioral measures of character skills, like delayed gratification, time devoted to solving a challenging task, and item non-response. These are all meant to capture related (although not identical) concepts, so they should correlate with each other. The fact that they don’t suggests that we still have a lot of work to do to refine our understanding of character skills and how best to measure them.
Angela Duckworth, who developed the self-reported grit scale, and David Scott Yeager, who is a pioneer in measuring growth-mindset, have been trying to warn the field that these measures are still in their infancy. They have an article in Educational Researcher and have been giving interviews emphasizing that while non-cog skills appear to be a very important part of later life success, our methods of measuring these concepts are still not very strong — certainly not strong enough to include in school accountability systems.
Our research showing the lack of relationship between behavioral and self-reported measures of character skills adds to the case for caution in using these measures for evaluation or accountability purposes. Remember, it took decades of research and practice to develop reliable standardized tests. A similar effort and patience will be required to develop reliable measures of character skills. And I suspect that even improved measures may be useful for research purposes but never robust enough to use for accountability.
Ed reformers can be dangerous if they are too much in a hurry. We unfortunately want to apply every new insight right away and lack patience for the careful development of policies and practices for long-term benefit. We also invest few resources in basic research that is essential for long-term gains. According to my analysis in a chapter in a new book edited by Rick Hess and Jeff Henig on education philanthropy, the largest education foundations only devote 6% of their funding toward research. And most of that research may really be short-term policy advocacy masquerading as research. The federal government is little better at making funds available for basic research.
Non-cog or character skills are incredibly important but if we are going to use these and other ideas to improve education, we are going to need a significant shift toward funding research and greater patience to bring those ideas to fruition.
So, you are showing that they are not reliable measures?
That seems to me to be putting the psychometric cart before the psychometric horse — as is almost always done.
Reliability is — quite famously — an upper bound on validity. But it does NOT logically follow that increasing reliability will increase validity. And yet, so often we act as though that happens automatically.
Just because a measure is reliable does NOT mean that it is valid. In fact, the ONLY reason why reliability matters is that it is an upper bound on validity. The only trait of tests that actually matters for its own sake is validity.
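The point that a measure can be highly reliable yet have no validity is easy to demonstrate with simulated data. Below is a minimal sketch (hypothetical numbers, NumPy only, not drawn from our study): ten survey items that all track the same irrelevant trait, so they agree closely with one another (high internal-consistency reliability) while being uncorrelated with the construct we actually care about (zero validity).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

true_skill = rng.normal(size=n)   # the construct we want to measure
irrelevant = rng.normal(size=n)   # an unrelated trait (e.g., acquiescence bias)

# Ten items that each reflect the irrelevant trait plus a little noise.
items = irrelevant[:, None] + 0.3 * rng.normal(size=(n, 10))
score = items.mean(axis=1)        # the scale score a researcher would compute

# Cronbach's alpha: internal-consistency reliability of the 10-item scale.
k = items.shape[1]
item_vars = items.var(axis=0, ddof=1).sum()
total_var = items.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1 - item_vars / total_var)

# Validity: correlation of the scale score with the true construct.
validity = np.corrcoef(score, true_skill)[0, 1]

print(f"reliability (alpha): {alpha:.2f}")           # high, close to 1
print(f"validity (corr with construct): {validity:.2f}")  # near 0
```

The items hang together almost perfectly, so alpha is high; but because they hang together around the wrong trait, the score tells us nothing about the skill of interest. Reliability caps validity from above; it never pushes it up from below.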
So, I understand that these measures are not highly reliable. But that does not tell us that they are any less valid than the cognitive tests we use, or the SGP or VAM outputs our statistical techniques give us.
I do not say this to suggest that we should be using these non-cog measures. Rather, I say this to point out that the difference in reliability does NOT indicate that they are the tiniest bit less appropriate for test-based accountability than the cognitive measures.
It never ceases to amaze me how the philanthropic class will demand measurable results but will not invest in developing measurements. I think they’re accustomed to thinking in terms of business, where measurement is often (though not always!) easier. Even those aspects of business that involve more difficult, qualitative measures (e.g., marketing) are usually infuriating to business owners. They just don’t believe in social science.
While more research is needed, there are far too few examples of quality research being translated into public policy. Were I handing out money I would want to see, er, evidence that K-12 research matters. I don’t like saying that; perhaps I can be enlightened as to where policy has been shaped by quality research.
As a reminder of the problem, just consider the slavish, near-craven efforts by many D and R politicians to tout how much more funding they are providing to local schools. This after decades of enormous growth in real per-pupil spending despite much research showing no causal link with achievement.
The media contribute to the problem. As one example, high quality school choice research rarely finds its way into the daily press. It contradicts the narrative of so many education reporters.
The research showing positive results from school choice has been extremely helpful in defusing opposition to its adoption in new states. That’s the only way we got to almost 60 programs serving 400,000 students.
Also Jay’s graduation rate research was indispensable in establishing the (now widely accepted) accurate dropout rate.
I wish I could be convinced that the gold standard research was indispensable. I am only personally familiar with a fraction of those programs; in none was the research that important. The main factors were solid organization and pure politics.
Excellent point on grad rates.
Well, it’s not a zero sum game in which “research mattering more” implies “politics matters less.” In the early years, there were only a handful of studies on a handful of topics, so the research was only one talking point. But as we promote and defend the new universal school choice program in Nevada, we are constantly talking about research – research on how choice impacts outcomes at public schools, research on the fiscal effect of school choice at public schools, research on how choice affects tolerance, research on racial segregation. In fact, we aren’t even talking very much about the “gold standard” research on outcomes for participants – the reason being that people are now pretty much ready to believe, without a lot of argument, that choice improves academic outcomes for the students who use it. That’s the result of many years of painstaking labor by people like Jay.
I very much wish it were true that “people are now pretty much ready to believe, without a lot of argument, that choice improves academic outcomes for the students who use it.” For example, the conventional wisdom in the Wisconsin Legislature and WI media is that choice makes no difference.
I have cited your work for Friedman, Jay’s work, and the work of others often in recent years. In each case the need to do so originated with lack of knowledge about the research.
I am NOT diminishing the research nor the fact that it has had some impact. I don’t believe it’s been as decisive as you suggest.
In my experience, when the research has had an impact, it’s been in reinforcing supporters. This, of course, is quite important, especially as it involves elected officials. But when it comes to “defusing opposition,” well, perhaps some elected officials have flipped from opposition to support. To the extent that has happened, I doubt it was decisive in enacting a program.
It is encouraging to hear that the research has helped and is helping to advance the Nevada legislation, a program that finally begins to move in the direction of real reform. Those who have worked and are working there are real trailblazers.