Fordham has a new report out that claims to have discovered three warning signs in charter applications that make those charters more likely to perform poorly in their initial years. If this were true, it would be a major development, given that prior research has failed to find characteristics of charter applications that predict later academic outcomes. Unfortunately, a straightforward interpretation of the results in Fordham’s new report suggests that there are no reliable predictors of charter failure. Despite organizations like the National Association of Charter School Authorizers (NACSA) receiving millions of dollars from foundations, and even contracts from states, to evaluate applications, there are no scientifically validated criteria for predicting charter failure from applications.
The Fordham analysis obtained charter applications from four states, focused on the successful ones, and identified 12 factors that could be coded consistently across applications and that might plausibly be related to future charter performance. The authors then ran a series of 12 logit regressions to see whether any of these 12 factors was significantly related to charters being low-performing later on. Only one was: applications that said the school intended to serve at-risk students were significantly more likely to come from schools that were later low-performing on standardized tests. (See Table C-1.) No other factor was significantly related to charter performance.
So, Fordham might have concluded that if you want to avoid authorizing low-performing charter schools, stay away from charters that serve disadvantaged kids. Of course, this would be a little like advising people who want to be millionaires to first start with a million dollars. All that the finding reveals is that their measure of charter outcomes is a lousy measure that fails to capture how charters might really help disadvantaged students.
But don’t worry, Fordham never highlights the straightforward results presented in Table C-1. Instead, the authors engage in a convoluted exercise in data mining to see whether they can turn up some more palatable and marketable results. They engage in a mechanical process of adding and removing these 12 factors, and interactions among them, in a single model until they arrive at a “best fit.” This is exactly the type of atheoretical mining of data that we warn our students not to do. You should have variables in your model because you think they are theoretically related to the dependent variable, not because you tried every combination and this is the one that gave you the best fit.
In total, there are 78 possible variables that could have been included if you try every one of the 12 factors plus every pairwise interaction of those 12. At the conventional 5 percent significance level, we would expect three or four of those 78 variables to come up statistically significant by chance alone, and sure enough the Fordham analysis finds three significant factors. This time they find that charters focused on at-risk students are more likely to fail, but only if the application also fails to propose intensive small-group instruction or tutoring. They also find that charters that fail to name a school leader are more likely to fail, but only if the charter is also not part of a CMO.
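The scale of the multiple-comparisons problem is easy to verify. Here is a quick back-of-the-envelope check in Python (a sketch for illustration, not Fordham’s actual analysis: the number of schools, the coin-flip outcome, and the simple two-proportion z-test are all stand-ins):

```python
import random
from itertools import combinations
from math import comb, erf, sqrt

# 12 coded factors plus every pairwise interaction gives the candidate pool
# for the "best fit" search:
n_factors = 12
n_candidates = n_factors + comb(n_factors, 2)      # 12 + 66 = 78

# At a 5% significance level, chance alone flags roughly four of them:
expected_false_positives = n_candidates * 0.05     # 3.9

# Monte Carlo check: 78 pure-noise predictors, an outcome that is pure
# coin-flipping, and a two-proportion z-test for each predictor.
def two_prop_p(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a difference in proportions (normal approximation)."""
    p_pool = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (hits_a / n_a - hits_b / n_b) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

random.seed(0)
n_schools = 300                                    # hypothetical sample size
factors = [[random.randint(0, 1) for _ in range(n_schools)]
           for _ in range(n_factors)]
predictors = factors + [[a & b for a, b in zip(factors[i], factors[j])]
                        for i, j in combinations(range(n_factors), 2)]
outcome = [random.randint(0, 1) for _ in range(n_schools)]  # random "low performance"

spurious_hits = 0
for x in predictors:
    n_a = sum(x)                                   # schools with the factor
    n_b = n_schools - n_a                          # schools without it
    hits_a = sum(o for x_i, o in zip(x, outcome) if x_i == 1)
    hits_b = sum(o for x_i, o in zip(x, outcome) if x_i == 0)
    if n_a and n_b and two_prop_p(hits_a, n_a, hits_b, n_b) < 0.05:
        spurious_hits += 1

print(n_candidates, expected_false_positives, spurious_hits)
```

Even though every predictor here is random noise, the screen typically flags a handful of them as “significant,” which is exactly what finding three significant factors out of 78 candidates looks like.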
Fordham can invent post-hoc rationalizations for why these particular combinations of factors make sense, but the main point is that this is all post hoc and could well just be the result of chance. And it is important to emphasize that proposing intensive small-group instruction, being part of a CMO, or even naming a school leader is not, on its own, a predictor of later test scores. It is only when these factors are interacted, and only in this particular combination of variables, that they reach statistical significance.
Even worse, the “best fit” model finds that a factor that was not a significant predictor in the straightforward analysis becomes significant when included alongside these other factors. So Fordham concludes that schools proposing a child-centered pedagogy are more likely to have low performance, even though the straightforward analysis found no such thing. When results vary depending on atheoretical changes in model composition, we call those findings unstable, or not robust. But Fordham is undeterred and draws the conclusion that child-centered schools may be bad despite this instability in the result.
The truth is that this report provides no scientific evidence on factors that predict future low performance by charter schools. I know that Fordham is determined to find Signs, but in this case they are likely just seeing the chance result of an atheoretical data-mining exercise.