School Choice Boosts Test Scores

(Guest post by Patrick J. Wolf)

Private school choice remains a controversial education reform.  Choice programs, involving school vouchers, tax-credit scholarships, or Education Savings Accounts (ESAs), provide financial support to families who wish to access private schooling for their child.  Once declared dead in the U.S. by professional commentators such as Diane Ravitch and Greg Anrig, private school choice has instead multiplied: there are now 50 programs in 26 states plus the District of Columbia.  Well over half of the initiatives have been enacted in the past five years.  Private school choice is all the rage.

But does it work?  M. Danish Shakeel, Kaitlin Anderson, and I just released a meta-analysis of 19 “gold standard” experimental evaluations of the test-score effects of private school choice programs around the world.  The sum of the reliable evidence indicates that, on average, private school choice increases the reading scores of choice users by about 0.27 standard deviations and their math scores by 0.15 standard deviations.  These are highly statistically significant, educationally meaningful gains, equivalent to several months of additional learning.  The achievement benefits of private school choice appear to be somewhat larger for programs in developing countries than for those in the U.S.  Publicly-funded programs produce larger test-score gains than privately-funded ones.

The clarity of the results from our statistical meta-analysis contrasts with the fog of dispute that often surrounds discussions of the effectiveness of private school choice.  Why does our summing of the evidence identify school choice as a clear success while others have claimed that it is a failure (see here and here)?  Three factors have contributed to the muddled view regarding the effectiveness of school choice:  ideology, the limitations of individual studies, and flawed prior reviews of the evidence.

School choice programs support parents who want access to private schooling for their child.  Some people are ideologically opposed to such programs, regardless of the effects of school choice.  Other people have a vested interest in the public school system and resist the competition for students and funds that comes with private school choice.  No amount of evidence is going to change their opinion that school choice is bad.

A second source of dispute over the effectiveness of choice is the limitations of each individual empirical study of school choice.  Some are non-experimental and can’t entirely rule out selection bias as a factor in their results (see here, and here).  Fortunately, over the past 20 years, some education researchers have been able to use experimental methods to evaluate privately- and publicly-funded private school choice programs.  Experimental evaluations take the complete population of students who are eligible for a choice program and motivated to use it, then employ a lottery to randomly assign some students to receive a school-choice voucher or scholarship and the rest to serve in the experimental control group.  Since only random chance, and not parental motivation, determines who gets private school choice and who doesn’t, gold standard experimental evaluations produce the most reliable evidence regarding the effectiveness of choice programs.  We limit our meta-analysis to the 19 gold standard studies of private school choice programs globally.
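For readers who want the mechanics, here is a minimal sketch of a lottery-based evaluation and its intent-to-treat comparison; every name and number in it is invented for illustration and is taken from none of the 19 studies:

```python
import random
import statistics

# Hypothetical applicant pool: everyone here is eligible for the
# program and motivated to use it, since they all applied.
applicants = [f"student_{i}" for i in range(1000)]

# The lottery: random chance alone decides who is offered a scholarship.
random.seed(42)  # fixed seed only so the sketch is reproducible
random.shuffle(applicants)
winners, losers = applicants[:500], applicants[500:]

# Simulated follow-up test scores (invented for illustration).
winner_scores = [random.gauss(0.15, 1.0) for _ in winners]
loser_scores = [random.gauss(0.00, 1.0) for _ in losers]

# The intent-to-treat effect: the difference in mean outcomes between
# lottery winners and losers, in control-group standard-deviation units.
gap = statistics.mean(winner_scores) - statistics.mean(loser_scores)
effect_size = gap / statistics.stdev(loser_scores)
print(f"estimated effect size: {effect_size:.2f} SD")
```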

Each of the gold standard studies, in isolation, has certain limitations.  In the experimental evaluation of the initial DC Opportunity Scholarship Program that I led from 2004 to 2011, the number of students in testing grades dropped substantially from year 3 to year 4, leading to a much noisier estimate of the program’s reading impacts, which were positive but just missed being statistically significant at 95% confidence.  Two experimental studies of the Charlotte privately-funded scholarship program, here and here, reported clear positive effects on student test scores but were limited to just a single year after random assignment.  Two recent experimental evaluations of the Louisiana Scholarship Program found negative effects of the program on student test scores, but one study was limited to just a single year of outcome data and the second (which I am leading) has only analyzed two years of outcome data so far.  The Louisiana program, and the state itself, are unique in certain ways, as are many of the programs and locations that have been evaluated.  What are we to conclude from any of these individual studies?

Meta-analysis is an ideal approach to identifying the common effect of a policy when many rigorous but small and particular empirical studies vary in their individual conclusions.  It is a systematic and scientific way to summarize what we know about the effectiveness of a program like private school choice.  The sum of the evidence points to positive achievement effects of choice.
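As a rough illustration of how that summing works (the study effect sizes and standard errors below are invented for the example, not drawn from our paper), a fixed-effect meta-analysis weights each study’s estimate by the inverse of its variance:

```python
# Minimal fixed-effect meta-analysis sketch.  Each entry is a
# hypothetical study: (effect size in SD units, standard error).
studies = [
    (0.30, 0.10),   # small study, noisy estimate
    (0.10, 0.05),   # larger study, tighter estimate
    (-0.05, 0.15),  # one negative finding
]

# Inverse-variance weighting: precise studies count for more.
weights = [1 / se**2 for _, se in studies]
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

# 95% confidence interval around the pooled estimate.
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect = {pooled:.2f} SD, 95% CI [{low:.2f}, {high:.2f}]")
```

A random-effects model, the usual choice when programs and contexts differ as much as these do, adds a between-study variance term to each study’s weight, but the pooling logic is the same.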

Finally, most of the previous reviews of the evidence on school choice have generated more fog than light, mainly because they have been arbitrary or incomplete in their selection of studies to review.  The most commonly cited school choice review, by economists Cecilia Rouse and Lisa Barrow, declares that it will focus on the evidence from existing experimental studies but then leaves out four such studies (three of which reported positive choice effects) and includes one study that was non-experimental (and found no significant effect of choice).  A more recent summary, by Epple, Romano, and Urquiola, selectively included only 48% of the empirical private school choice studies available in the research literature.  Greg Forster’s Win-Win report from 2013 is a welcome exception and gets the award for the school choice review closest to covering all of the studies that fit his inclusion criteria – 93.3%.  (Greg for the win!)

Our meta-analysis avoided all three factors that have muddied the waters on the test-score effects of private school choice.  It is a non-ideological scientific enterprise, as we followed strict meta-analytic principles such as including every experimental evaluation of choice produced to date, anywhere in the world.  Our study was accepted for presentation at competitive scientific conferences including those of the Society for Research on Education Effectiveness, the Association for Education Finance and Policy, and the Association for Policy Analysis and Management.  Our study is not limited by small sample sizes or only a few years of outcome data.  It is informed by all the evidence from all the gold standard studies.  Finally, there is nothing arbitrary or selective in our sample of experimental evaluations.  We included all of them, regardless of their findings.  When you do the math, students achieve more when they have access to private school choice.

13 Responses to School Choice Boosts Test Scores

  1. Frederick ROBERTS says:

    Changes of less than a half standard deviation are significant? Are those results after a week or month or results after a year?

    After coming from an inferior school, a student who studied for a year in an improved school should gain a standard deviation or more for the result to be significant. Gains close to a full standard deviation would be interesting, but anything less than 75% of a std dev would be blah!

    Who determines that .27 and .15 std devs are significant? What nonsense is that?

    As I have often thought, the modern American genius is the ability to dumb anything down, so all who pass can feel good and therefore equal in some grander sense.

    Fear that some races and classes might really be inferior seems to inspire timidity in marking to an absolute standard.

    My experience has been that races and classes are not inferior. Some members of certain races and classes may score worse, but lower scores are usually traceable to something else: coming from a family where nobody speaks English, a chaotic family, a family whose members rarely talk to one another except to yell, a family where few read and no child is read to.

    Only if one marks children on an absolute standard appropriate for a mainstream member of a modern society will problems be tagged. If not tagged, problems cannot be fixed. As matters now stand, declaring victory over a .27- or .15-std dev gain (after one year!) amounts to sweeping problems under the carpet.

    • Frederick, you confuse statistical significance with substantive significance. By “highly statistically significant” I mean that we can be extremely confident in the direction of the effect of school choice on test scores — positive. The confidence interval around the estimate is completely above 0. Your argument is that an effect size of .15 to .27 standard deviations is not substantively significant. Compared to what? Mark Lipsey of Vanderbilt performed a meta-analysis of every rigorous evaluation of every education intervention in the U.S. from 2000 to 2006 and found that the average of the 831 total effect sizes was a gain of .05 to .07 standard deviations, depending on the level of schooling (elementary, middle, or high school). The test score gains from school choice are substantively significant because they are two to five times larger than the typical effect of an education program. Gains of .75 standard deviations are almost unheard of and limited to hot-house programs that spend $40-50K per student and can’t be scaled.
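      A back-of-the-envelope check of those two claims, with a standard error invented purely for illustration (the actual standard errors are in the paper):

      ```python
      # "The confidence interval is completely above 0": the 0.27 SD
      # reading estimate is from the post above; the standard error is
      # invented here purely for illustration.
      estimate, se = 0.27, 0.05
      low, high = estimate - 1.96 * se, estimate + 1.96 * se
      print(f"95% CI [{low:.2f}, {high:.2f}]")  # [0.17, 0.37], entirely above 0

      # "Two to five times larger than the typical effect" found by Lipsey:
      print(round(0.15 / 0.07, 1), round(0.27 / 0.05, 1))  # roughly 2.1 and 5.4
      ```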

      • Greg Forster says:

        This is why I never use the word “significant,” in either sense, except when talking inside baseball with fellow scientists.

      • Unboxing Politics says:

        I know that I’m replying to this comment 4 years after the fact, but I searched for the Lipsey meta-analysis described in the post, and I did not find the same results that Dr. Wolf is describing. According to Table 9 from Lipsey’s report (https://ies.ed.gov/ncser/pubs/20133000/pdf/20133000.pdf), the mean effect sizes for elementary, middle, and high school interventions were 0.28, 0.33, and 0.23, respectively. The mean effect size varies by the type of test used to measure student outcomes, but even on the tests with the lowest effect sizes, the mean effect sizes for elementary and middle schoolers were 0.08 and 0.15, respectively (high school data was unavailable). If I’m not looking at the correct meta-analysis, or I’m looking at the correct meta-analysis incorrectly, feel free to let me know.

  2. Greg Forster says:

    Thanks for the kind words! One additional reason the research consensus has not received wider notice is that researchers, journalists and politicians all have powerful incentives (independent of ideology) to pretend that the research is inconclusive. But the evidence demands a verdict.

  3. […] Wolf observed in a blog post explaining the findings, the “clarity of the results… contrasts with the fog of dispute that […]

  4. Mike G says:

    Excellent.

    Question –

    a. Is a gain of, say, 0.2 in grade 8 more valuable/impressive/worthwhile than a gain of 0.2 in grade 1? I.e., my understanding is that the annual gain in reading and math is largest in grade 1 and declines every year through grade 12 (i.e., simply that more of a kid’s level is “fixed”). So a gain of 0.2 SDs in grade 8 might mean “6 more months of learning” versus a smaller amount of additional learning in the younger grades. Do you generally agree here, or am I misunderstanding?

    b. If true, is it worth trying to adjust the effects from the 11 locations to account for grade level?

    • Mike,

      Yes, you are correct. A gain of .2 SD in the older grades is more impressive than a gain of .2 SD in the younger grades. Test scores hardly move at all after 10th grade.

      It would take some work but we probably should “grade adjust” our impact estimates as we move forward with the project. Thanks for the suggestion!
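      A minimal sketch of what such a grade adjustment could look like, with placeholder annual-gain benchmarks rather than real estimates:

      ```python
      # Hypothetical grade adjustment: express an SD-unit effect size as
      # rough months of extra learning, given how much students at that
      # grade typically gain in a year.  These benchmarks are placeholders.
      typical_annual_gain = {1: 1.00, 4: 0.40, 8: 0.25}  # SD units, invented

      def months_of_learning(effect_size, grade, school_year_months=9):
          """Convert an SD-unit effect into rough months of extra learning."""
          return school_year_months * effect_size / typical_annual_gain[grade]

      print(round(months_of_learning(0.2, 1), 1))  # ~1.8 months in grade 1
      print(round(months_of_learning(0.2, 8), 1))  # ~7.2 months in grade 8
      ```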

  5. Patrick says:

    I paid for the Epple, Romano, and Urquiola study after Noah Smith (an economics professor at Stony Brook) cited the paper in a Bloomberg View op-ed. He used the paper as his source to claim that vouchers should be eliminated because “there is no evidence that vouchers improve grades at *any* horizon.”

    I then followed up and sent him five reports the Epple survey didn’t cover, and I pointed out that even Epple, Romano, and Urquiola believe vouchers tend to provide benefits when targeted to low-income students and argue that the experiments should be continued and studied.

    Smith never responded. Do you know if “at any horizon” is a special term in economics? I’m unfamiliar with it.

    At any rate, Smith argued that school choice supporters are ideologues and rational empiricists wouldn’t support such a policy, the opposite of what his own source said. Makes me wonder if I could be a professor at Stony Brook.

    I’ll give your report a read now. Thanks.

  6. […] appear to be somewhat larger for programs in developing countries than for those in the U.S. Wolf explains, “Our meta-analysis avoided all three factors that have muddied the waters on the test-score […]

  7. […] as Wolf noted in the post in which he explained the findings, “the clarity of the results… […]

  8. […] and using scientifically exacting methods concluded that private school choice results in statistically significant improvements in reading and math performance, 0.27 standard deviations and 0.15 standard deviations, […]

  9. […] studies, and using scientifically exacting methods concluded that private school choice results in statistically significant improvements in reading and math performance, 0.27 standard deviations and 0.15 standard deviations, […]
