(Guest Post by Matthew Ladner)
The hidden highlight from the Evaluation of the DC Opportunity Scholarship Program: Impacts After Two Years report is buried in the Appendix, pp. E-1 to E-2:
Applying IV analytic methods to the experimental data from the evaluation, we find a statistically significant relationship between enrollment in a private school in year 2 and the following outcomes for groups of students and parents (table E-1):
• Reading achievement for students who applied from non-SINI schools; that is, among students from non-SINI schools, those who were enrolled in private school in year 2 scored 10.73 scale score points higher (ES = .30) than those who were not in private school in year 2.
• Reading achievement for students who applied with relatively higher academic performance; the difference between those who were and were not attending private schools in year 2 was 8.36 scale score points (ES = .24).
• Parents’ perceptions of danger at their child’s school, with those whose children were enrolled in private schools in year 2 reporting 1.53 fewer areas of concern (ES = -.45) than those with children in the public schools.
• Parental satisfaction with schooling, such that, for example, parents are 20 percentage points more likely to give their child’s school a grade of A or B if the child was in a private school in year 2.
• Satisfaction with school for students who applied to the OSP from a SINI school; for example, they were 23 percentage points more likely to give their current school a grade of A or B if it was a private school.
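A quick back-of-the-envelope check (my own arithmetic, not from the report): an effect size is just the score difference divided by the standard deviation, so the two reading results above each imply a test-score SD of roughly 35 scale score points. The fact that both bullets imply about the same SD suggests the quoted numbers are internally consistent.

```python
# Hedged sketch: ES = score difference / standard deviation, so the SD
# implied by each reported effect size can be backed out by division.
# The report does not state the SD directly; this is just arithmetic
# on the two numbers quoted above.

def implied_sd(score_diff, effect_size):
    """Back out the standard deviation implied by a reported effect size."""
    return score_diff / effect_size

sd_non_sini = implied_sd(10.73, 0.30)   # non-SINI applicants
sd_high_perf = implied_sd(8.36, 0.24)   # higher-performing applicants

print(round(sd_non_sini, 1))   # ~35.8
print(round(sd_high_perf, 1))  # ~34.8
```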
I’m trying to figure out why the impact of actually using the voucher isn’t the focus of this study and is instead relegated to an appendix. Instead, all the “mixed” results measure the impact of having been offered a scholarship, whether the student actually used it or not.
I’m going to walk way out on a limb here and predict that the impact on test scores of being offered but not using a voucher will be indistinguishable from zero. If this were a medical study, it would be as if patients in the experimental group were offered a drug, some of them chose not to take it, but we ignored that fact and measured the drug’s impact based on the results of both those who took it and those who didn’t. Holding the pill bottle can’t be presumed to have the same impact as taking the pills.
We’ve all been told that exercise is good for our health. Should we judge the effectiveness of exercise on health outcomes by what happens to those who actually exercise, or by the results for everyone who has been told that it is good for them?
This shortcoming has been corrected in the Appendix, but that is getting very little attention. On page 24 the evaluation reads:
Children in the treatment group who never used the OSP scholarship offered to them, or who did not use the scholarship consistently, could have remained in or transferred to a public charter school or traditional DC public school, or enrolled in a non-OSP-participating private school.
So in the report’s main discussion, the kids actually attending private schools have to make gains big enough to make up for the fact that many “treatment” kids are actually back in DCPS. As it turns out, several subsets of students do make such gains, but that’s not the point. The point is we ought to be primarily concerned with whether actual utilization of the program improves education outcomes and with systemic effects of the program. We should indeed study who actually uses this program, and who chooses not to and the reasons why (very important information), but this sort of analysis seems to belong in the appendix rather than the other way around.
Receiving an offer of a school voucher doesn’t constitute much of an education intervention, and it seems painfully obvious that the discussion around this report is conflating the impact of voucher offers with that of voucher use. The impact of voucher use is clear and positive.
I’d love to agree with you here, Matt, but I can’t. Including the kids who were offered a voucher and didn’t use it in the treatment group is the correct procedure.
What policymakers are able to do is *offer* a voucher, not force people to use it. Therefore what we care about is the impact of offering a voucher. Suppose we offered vouchers to everybody and nobody used them. That would be pretty conclusive proof that vouchers are not an effective policy. At a less extreme level, suppose the DC voucher program is poorly designed in such a way that it discourages people who are offered the voucher from using it. That ought to be reflected in the study results.
Similarly, students who aren’t offered a voucher but who went to private school anyway are included in the control group. That’s because even without vouchers, some students have access to private schools. If we’re going to measure the impact of vouchers as compared to a world without vouchers, we have to include non-voucher private school attendance in the “world without vouchers” scenario.
In short, lack of takeup by voucher winners, to the extent that it occurs, has to be considered when measuring the effectiveness of vouchers, and the availability of private schooling to some students who don’t have vouchers is part of the status quo against which vouchers should be measured.
Your two health analogies (drug trials and exercise) actually highlight the reason for doing the study this way.
Doing the study the way Pat did it is called an “intention to treat” model. You’re not only measuring the effectiveness of the treatment if the subject gets it, but the effectiveness of the intention to treat the subject.
It is important to notice that measuring intention to treat includes the effectiveness of the treatment as one factor. The other factor is whether the intention to treat translates into actually receiving the treatment. So you’re not measuring the effectiveness of intention to treat *instead of* the effectiveness of the treatment itself; you’re measuring the effectiveness of intention to treat *including* both the effectiveness of the treatment and the effectiveness of getting the treatment to the subject as a combined effect.
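To make the combined-effect point concrete, here is a toy simulation of my own (all numbers hypothetical, not from the DC study): when only 60 percent of the offered group actually takes the treatment, the intention-to-treat estimate comes out near the per-user effect diluted by the takeup rate, not near the per-user effect itself.

```python
import random

random.seed(0)

N = 100_000       # hypothetical sample size per group
TAKEUP = 0.6      # hypothetical share of the offered group that takes the treatment
EFFECT = 10.0     # hypothetical score gain for those who actually take it
BASELINE = 500.0  # hypothetical mean score without treatment
NOISE_SD = 30.0   # hypothetical spread of individual scores

# Offered group: only those who take up the offer receive the effect.
offered = [BASELINE + (EFFECT if random.random() < TAKEUP else 0.0)
           + random.gauss(0.0, NOISE_SD) for _ in range(N)]

# Control group: nobody receives the treatment.
control = [BASELINE + random.gauss(0.0, NOISE_SD) for _ in range(N)]

# The intention-to-treat estimate is the simple difference in group means.
itt = sum(offered) / N - sum(control) / N
print(round(itt, 1))  # close to TAKEUP * EFFECT = 6.0, not to EFFECT = 10.0
```

Dividing `itt` by `TAKEUP` recovers roughly the per-user effect, which is the arithmetic behind treatment-on-treated adjustments.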
Now we come to the point. Health studies are not for policymakers, they’re for doctors. And doctors always have intention to treat. If you’re in the doctor’s office, it’s already decided that you’re going to get a treatment. The only remaining question is which treatment you’re going to get. Thus, what we care about is measuring the effectiveness of the treatments themselves.
In public policy, by contrast, there are countless different competing interests at stake, and not all of them are motivated by intention to treat. On the one hand, you have reformers who want kids to get a better education (i.e. intention to treat). On the other hand, you have unions that only want to keep their gravy train going and don’t care how many children’s lives they have to destroy (i.e. not so much intention to treat). And when a given treatment (e.g. vouchers) is on the table, the only *policy* decision is how much influence we’re going to give to intention to treat (i.e. enact the policy) versus other intentions. Thus, the relevant question for measuring the effectiveness of a public policy is intention to treat.
Let me put this another way. Suppose your doctor gives you pills and you don’t take them. If he finds out, he can yell at you – because you’re supposed to be taking your pills. But if you’re offered a voucher and you don’t use it, nobody can yell at you. You’re not “supposed” to be using it. It’s your choice. That means the choices people make regarding whether to use the vouchers they’re offered have to be taken into account when we measure the effectiveness of the policy.
Greg-
All of that sounds like a good reason to do an “intention to treat” analysis and then include it in an appendix. It seems clear to me that the political discussion around vouchers focuses on two questions: first, do they “work,” which I think a large majority of people understand to mean “do they improve outcomes for students who use them,” and second, “what impact, if any, does a parental choice program have on students remaining in the public schools?”
An intention to treat analysis does indeed give valuable information, but this analysis is being conflated into “vouchers don’t work” or “DC kids got vouchers and it didn’t make a difference,” when that is clearly not the case for the kids who used them. The kids who used them transferred out of DCPS and made significant gains.
Well, those characterizations of the finding would have been wrong even if we set aside the intention to treat issue. The analysis doesn’t find that the vouchers didn’t work; it finds that the voucher students had higher test scores, but we can only be 91 percent confident that this was due to vouchers and not a statistical fluke.
I’m not arguing that it doesn’t matter what you say because the media will only misrepresent it anyway. It’s true that no matter what you say the media will only misrepresent it. But it still matters what you say.
However, if being sensitive to media manipulation is the issue, consider what the media reaction would have been if the user analysis were highlighted and the intention to treat analysis relegated to a footnote. In that case, you’d have people in the IES talking on background to reporters about how the study was manipulated to produce the “right” result, because the appropriate analysis was buried in a footnote. Would that look any better?
I agree that both analyses provide useful information. The question here is which one is most relevant. And that depends on the purpose of the study; different things will be more relevant depending on your purpose.
In this case, the study was commissioned by Congress to evaluate the effectiveness of the program. Not the effectiveness of private schools for those who attend them, not the effectiveness of vouchers for those who use them, but the effectiveness of the voucher program simply as such. Lack of takeup is one thing that affects the effectiveness of the program.
As a digression, the empirical evidence supporting vouchers is so overwhelming that I’m not really worried that this study just barely failed to reach an arbitrary benchmark of statistical certainty. Anyone who wants a rundown of all the randomized studies on vouchers can find it here:
http://www.friedmanfoundation.org/friedman/newsroom/ShowNewsReleaseItem.do?id=20107
I’d love to see some IES person trying to convince a reporter “but….see…..if you include the kids who didn’t use the voucher, then the overall number goes below the standard level of significance!”
The reporter’s reaction is likely to be “what in the world are you talking about?” The IES guy could try to wax poetic about the theory of research design, but any reporter worth their salt would bring this back to the central question: “Do kids using vouchers learn more, or not?”
Right, because reporters would never buy the storyline that the Bush administration manipulates research for political ends.
The reporters would simply take dictation from their IES sources, report those claims as fact, and then for the sake of fair play they’d include a quote from our side where we say “But . . . see . . . if you . . . “
Greg is right. Matt is wrong. (Sorry, Matt)
ITT is an evaluation of a voucher program; IOT is an evaluation of private schooling. IOT would only be appropriate here if you were going to have a voucher program that mandated private schooling, but that’s not what we have or would ever argue for. ITT is an evaluation of the impact of the opportunity of private schooling, which is what a voucher is. In fact, to have an absolutely true random assignment study (obviously impossible), rather than holding a lottery among those who apply, you would just hold a lottery among everyone in a school system and let them decide to use it or not. That would be a more accurate representation of a voucher system that operates in the context of a public school environment.
My main issue is using a Bloom adjustment as the IOT rather than an IV procedure. For some reason IES seems to hate IV, but I still have no idea why. Seems to me that Bloom throws useful information away. But that’s IOT, so of secondary importance anyway.
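For readers unfamiliar with the terms: the Bloom adjustment simply rescales the ITT estimate by the takeup rate among those offered, while an IV procedure (two-stage least squares using the offer as the instrument for enrollment) estimates the same quantity from the full regression machinery. A minimal sketch of the Bloom arithmetic, with made-up numbers:

```python
# Hedged sketch of the Bloom (1984) adjustment: the treatment-on-treated
# estimate is the ITT estimate divided by the takeup rate among those
# offered. All numbers here are hypothetical, not from the report.

def bloom_tot(itt_estimate, takeup_rate):
    """Rescale an intention-to-treat estimate to a treatment-on-treated one."""
    return itt_estimate / takeup_rate

itt = 4.5      # hypothetical ITT score gain
takeup = 0.75  # hypothetical share of the offered group that used the voucher
print(bloom_tot(itt, takeup))  # 6.0

# With a single binary instrument (the offer) and no crossover into
# treatment from the control group, the Wald/IV estimate reduces to the
# same ratio: (E[Y|offered] - E[Y|control]) / (takeup_offered - takeup_control).
```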
Welcome to the blogosphere, Marcus! As Obi-Wan said, you’ve taken your first step into a larger world. And as the Borg said, resistance is futile.
Let me speak for all journalists: The reporter wants to know how the voucher-using students did compared to the students who weren’t offered a voucher. This is the easiest issue for people to understand: If you let low-income parents pick private schools, will their kids perform better? The reporter also wants to know what percentage of students didn’t use the voucher because that raises the issue of whether there are enough accessible schools that appeal to parents.
Most reporters are weak on math skills, but I don’t think you’ll find many education writers who don’t care about kids getting educated. Assume good faith and lousy math skills.
Let me speak for all nit-pickers: If you want to know the answer to the question, “If you let low-income parents pick private schools, will their kids perform better?” then the appropriate method is to leave the voucher decliners in the sample. The key word in that question is “let.”
I think you mean you want to know this: “If low-income parents actually pick private schools using a voucher, will their kids perform better?”
Meanwhile, thanks for linking to my post today on NCLB, and for paying me the ultimate compliment for an education researcher: mistaking me for Jay Greene! 🙂
Joanne has confirmed my suspicions: ITT could be the right way to answer certain questions, but it’s not the question that the vast majority of people want answered. Worse still, people think they are getting their question answered, when in fact you have to dig into Appendix E to find it.
I can’t add much to what Greg and Marcus have already rightly pointed out. The study is ultimately an evaluation of expanded choices, and everyone in the treatment group was given that. It is not an evaluation of private schooling. I will, however, question Greg’s reliance on the Borg for wisdom. Resistance is not futile. If I remember right, Picard ultimately resisted, and the Borg are no more.
Furthermore, I can point out that the ITT analysis had the same significant subgroup findings (albeit with less magnitude) as the “hidden” IV analysis had. See pages 37-38 of the report.
What do you mean, the Borg are no more? They’re still out there. The loss of a few stray cubes here and there is of little long-term consequence to them. And someday no doubt they’ll get us all.
But of course, in another sense, resistance isn’t futile even if it only delays the conquest.