The Gates Effective Teaching Initiative Fails to Improve Student Outcomes

June 21, 2018

Rand has released its evaluation of the Gates Foundation’s Intensive Partnerships for Effective Teaching initiative and the results are disappointing.  As the report summary describes it, “Overall, however, the initiative did not achieve its goals for student achievement or graduation, particularly for LIM [low income minority] students.” But in traditional contract-research-speak this summary really under-states what they found.  You have to slog through the 587 pages of the report and 196 pages of the appendices to find that the results didn’t just fail to achieve goals, but generally were null to negative across a variety of outcomes.

Rand examined the Gates effort to develop new measures of teacher effectiveness and align teacher employment, compensation, and training practices to those measures of effectiveness in three school districts and a handful of charter management organizations.  According to the report, “From 2009 through 2016, total IP [Intensive Partnership] spending (i.e., expenditures that could be directly associated with the components of the IP initiative) across the seven sites was $575 million.”  In addition, Rand estimates that the cost of staff time to conduct the evaluations to measure effectiveness totaled about $73 million in 2014-15, a single year of the program.  Assuming that this staff time cost was the same across the 7 years of the program they examined, the total cost of this initiative exceeded $1 billion.  The Gates Foundation paid $212 million of this cost, with the rest being covered primarily by “site funds,” which I believe means local tax dollars.  The federal government also contributed a significant portion of the funding.
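The back-of-the-envelope total above can be reproduced in a few lines. This is only a sketch of that arithmetic: the dollar figures are the ones quoted from the report, while the variable names are illustrative and the constant yearly staff-time cost is the same assumption made in the text.

```python
# Figures in millions of dollars, as quoted from the Rand report above.
direct_spending = 575        # total IP spending, 2009 through 2016
staff_time_per_year = 73     # estimated evaluation staff time in 2014-15
years = 7                    # ASSUMPTION: same staff-time cost in every year

total_cost = direct_spending + staff_time_per_year * years
print(f"Estimated total cost: ${total_cost:,} million")
```

If the staff-time cost was lower in the early years, the total would shrink accordingly, so the $1 billion figure is best read as a rough estimate.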

So what did we get for $1 billion?  Not much.  One outcome Rand examined was whether the initiative made schools more likely to hire effective teachers.  The study concluded:

Our analysis found little evidence that new policies related to recruitment, hiring, and new-teacher support led to sites hiring more-effective teachers. Although the site TE [teacher effectiveness] scores of newly hired teachers increased over time in some sites, these changes appear to be a result of inflation in the TE measure rather than improvements in the selection of candidates. We drew this conclusion because we did not observe changes in effectiveness as measured by study-calculated VAM scores, and we observed similar improvements in the site TE scores of more-experienced teachers.

Another outcome was the increased retention of effective teachers:

However, we found little evidence that the policies designed, in whole or in part, to improve the level of retention of effective teachers had the intended effect. The rate of retention of effective teachers did not increase over time as relevant policies were implemented (see the leftmost TE column of Table S.1). A similar analysis based only on measures of value added rather than on the site-calculated effectiveness composite reached the same conclusion (see the leftmost VAM column of Table S.1).

Did the program improve teacher effectiveness overall and specifically access by low income minority students to effective teachers?

…An analysis of the distribution of TE based on our measures of value added found that TE did not consistently improve in mathematics or reading in the three IP districts. There was very small improvement in effectiveness among mathematics teachers in HCPS [Hillsborough County] and SCS [Shelby County] and larger improvement among reading teachers in SCS, but there were also significant declines in  effectiveness among reading teachers in HCPS and PPS [Pittsburgh]. In addition, in HCPS, LIM students’ overall access to effective teaching and LIM students’ school-level access to effective teaching declined in reading and mathematics during the period of the initiative (see Table S.2). In the other districts, LIM students did not have consistently greater access to effective teaching before, during, or after the IP initiative.

And was there an overall change as a result of the program in student achievement and graduation rates?

Our analyses of student test results and graduation rates showed no evidence of widespread positive impact on student outcomes six years after the IP initiative was first funded in 2009–2010. As in previous years, there were few significant impacts across grades and subjects in the IP sites.

Here I think the report is casting a more positive spin on the results than their findings show.  Check out this summary of results from each of the sites:

I see a lot more red (significant and negative effects) than green (significant and positive). The report’s overall conclusion is technically true only because it focuses just on the last year (2014-15) and because it examines each of these 4 sites separately.  A combined analysis across sites and across time, which they don’t provide, would likely show a significant and negative overall effect on test scores.

The attainment effects are also mostly negative.  To find the attainment results at all, you have to dive into a separate appendix file.  There you will see that Pittsburgh experienced a decrease in dropout rates of between 1.3 and 3.5%, depending on the year, which is a positive result.  But Shelby County showed a significant decrease in graduation rates in every year but one.  While dropout, unlike grad rate,  is an annualized measure, the decrease in Shelby County’s graduation rate was as large as 15.7%.  The charter schools also showed a significant decrease in graduation rates as a result of the program in every year but one, with the decline as large as 6.6%.  And Hillsborough experienced a significant increase in dropout rate in one year of about 1.5%.  In three of the four sites examined there were significant, negative effects on attainment. In one site there were positive effects on attainment.

The difference-in-differences analysis that Rand is using is not perfect at isolating causal effects. And as the report notes, comparison districts were also sometimes implementing reform strategies similar to those of the Partnership sites. But you would expect that the injection of several hundred million dollars and considerable expert attention would improve implementation in the Partnership districts, so the comparison is still informative. Besides, the fact that some comparison districts were pursuing some of the same reforms does not explain the splattering of red (negative and significant effects) we see.
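For readers unfamiliar with the method, the core idea can be sketched as simple arithmetic: take the change in the Partnership sites and subtract the change in the comparison sites over the same period. The function name and every number below are hypothetical, invented purely for illustration; they are not from the Rand report, and a real evaluation would add controls and standard errors.

```python
def diff_in_diff(treated_pre, treated_post, comparison_pre, comparison_post):
    """Change in the treated group minus change in the comparison group."""
    return (treated_post - treated_pre) - (comparison_post - comparison_pre)

# HYPOTHETICAL average test scores, made up for this example only.
effect = diff_in_diff(
    treated_pre=250.0, treated_post=252.0,
    comparison_pre=249.0, comparison_post=254.0,
)
print(effect)  # negative: the treated group gained less than the comparison group
```

Note that in this made-up case the treated group's scores rose, yet the estimated effect is negative because the comparison group rose more. That is the sense in which a "red" cell can appear even where raw scores did not fall.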

As Mike McShane and I note in the book we recently edited on failure in education reform, there is nothing inherently wrong with trying a reform and having it fail.  The key is learning from failure so that we avoid repeating the same mistakes.  It is pretty clear that the Gates effective teaching reform effort failed pretty badly.  It cost a fortune.  It produced significant political turmoil and distracted from other, more promising efforts.  And it appears to have generally done more harm than good with respect to student achievement and attainment outcomes.

The Rand report draws at least one appropriate lesson from this experience:

A favorite saying in the educational measurement community is that one does not fatten a hog by weighing it. The IP initiative might have failed to achieve its goals because the sites were better at implementing measures of effectiveness than at using them to improve student outcomes. Contrary to the developers’ expectations, and for a variety of reasons described in the report, the sites were not able to use the information to improve the effectiveness of their existing teachers through individualized PD, CLs, or coaching and mentoring.



How Beneficial Is Pre-K?

June 16, 2018


(Guest post by Greg Forster)

That’s the question in my new policy brief from OCPA.

Today the Oklahoman ran an op-ed adapted from that policy brief:

Policymakers shouldn’t spend big money expanding pre-K when the benefits are so uncertain. They should also take pre-K off Oklahoma’s automatic-funding conveyor belt; it should have to make a case for itself like every other discretionary expense.

Moreover, Oklahoma should consider introducing school choice design in existing pre-K programs, to strengthen the freedom and power of parents. Oklahoma’s existing program permits schools to partner with community organizations; why not allow community organizations to serve parents directly?

Let me know what you think!

Genuinely Restorative Justice Must Be Strict, Not Soft

June 13, 2018

Max Eden

(Guest post by Greg Forster)

Max Eden’s outstanding piece, which Jay extols here, shows not only how lax discipline leads to bullying, chaos and death, but also how the language of “restorative justice” has been corrupted in ways that are already having terrible consequences.

Justice ought to be restorative. The purpose of justice is not revenge. It is to restore offenders to society – debt paid and ready to try again.

But the debt must be paid. With some exceptions, in general an offender is not really “restored” on a moral, psychological or social level until they have suffered just punishment. That is the only reason punishments exist. And people are ruined if they are raised up learning the lesson that there will be no consequences for bad behavior.

In other words, retributive justice is a necessary element of genuinely restorative justice.

Unfortunately, people whose goal is not to do justice but to reduce the severity of punishments have hijacked the concept of restoration. We are now trapped in a terminological system in which “restorative justice” means the opposite of “retributive justice.” People think they are helping restore kids when they are actually destroying them.

The really terrifying result of this change is not that it gives unearned rhetorical credibility to advocates of lax discipline. It is the response from the other side.

The overwhelming majority of people can see the destructiveness of lax discipline. They are therefore concluding that “restorative justice” is dangerous and destructive. Therefore they are rejecting restoration as a goal of justice. And when you do that, all that’s left is the limitless cruelty of revenge.

The increasing tendency of some to dehumanize criminals and demand harsher and harsher treatment of them cannot be fought by advocacy of lax punishments in the name of “restorative justice.” It is directly caused by advocacy of lax punishments in the name of “restorative justice.”

Only retributive justice, which affirms that punishment is not an arbitrary tool of social control but a just and necessary consequence of the crime that the criminal is morally obligated to suffer, can be effective in restraining the abuse of criminals – and promoting their genuine restoration.

As C.S. Lewis once said, I plead for retributive justice not primarily for the sake of society, or for the sake of crime victims, but for the sake of the criminal.

Max Eden May Be One of the Only Education Reporters Left, and He Isn’t Even a Reporter

June 12, 2018


Max Eden may be one of the only education reporters left, and he isn’t even a reporter. His article in The 74 today describing how changes in discipline policy led to the severe deterioration of behavior in a New York City school may be one of the best pieces of education journalism I have read in many years.  It is thoroughly documented, clearly described, and conveys a compelling and alarming story about how discipline reform may go awry.

To be clear, Max does not prove in this piece that discipline reform necessarily or even typically leads to these problems. But that is not what journalism does. Reporting raises issues that social science can then examine using its approach to adjudicate whether these patterns are causal and systematic. The problem is that too many people seem to confuse journalism and social science and think that only the latter should exist.

I came to this realization as I was wondering why I so rarely come across the kind of quality journalism contained in Max’s piece.  What are education reporters doing instead?  First, we unfortunately have far fewer education journalists than we used to.  Education is mostly a local story and local newspapers and their ranks of education reporters have been decimated by the rise of internet news over the last two decades.  Second, the national and often foundation-subsidized outlets we have left are often focused on advancing various agendas, whether reform-oriented or partisan, and seem to have little interest in the type of in-depth reporting contained in Max’s piece.

Third, and perhaps most alarming, is that there is a new type of education journalist who imagines him or herself as a mini-social scientist who adjudicates for us what “the research says.”  Despite having no social science training or experience conducting research, this new breed of education journalist holds forth on what the correct interpretation of the social science evidence is.  Often they do this on Twitter, which has a short format that does not allow for in-depth discussion.  Anyone can sound like an expert in a few hundred characters.

But the truth is that there is usually no simple narrative about what social science has to say and reporters are very poorly positioned to adjudicate the truth about social science.  In the past, reporters understood this and used to leave claims about what the evidence says to researchers.  Reporters who covered research saw their role as quoting competing researchers so audiences could get some understanding of the issues in dispute.

Not any more. Now this new breed of faux social scientist/reporter regularly holds forth on what the evidence tells us.  And not surprisingly, the cool kid club of social scientists whose research is affirmed by this new breed of reporter has plenty of praise to heap upon the reporter for being so smart and wise as to say that the researcher is correct. These reporters and researchers have formed a mutual admiration society.  Any criticism of either reporters or researchers in this tight circle is met with considerable outrage and re-iteration of praise for each other, typically on Twitter.

If reporters are going to start masquerading as social scientists, I suppose it is only right that others should step in and start to play the role of journalists. The world doesn’t need (and is little influenced by) reporters pretending to be social scientists and adjudicating what the evidence says. But the world does need, and our research agenda will be influenced by, the type of in-depth reporting that Max Eden has done in his new article on discipline in a New York City school.

JS Mill to Lauren Ritchie…We Don’t Need No Thought Control

June 11, 2018

(Guest Post by Matthew Ladner)

Last week saw a number of Florida media folks getting overwrought about science education in private schools. Private schools are filled with snake-handling hillbillies who don’t teach proper science, goes the meme: classic two-minute hate material. There is a large problem with this, however: there is zero evidence in NAEP that private schools fail to teach science.

If you want to shut down private choice programs based on inadequate science education, you will need to shut down the public schools as well, as the available data from the NAEP consistently show lower public school science scores. Even among private school students in the South whose parents did not finish college, 8th grade science scores look like:

This is my apprentice, Darth Fable, he will rile up the masses for a two-minute hate….

The fact that multiple Florida media outlets can get themselves lathered up over such a phantom menace, however, speaks to a deeper problem. In On Liberty, John Stuart Mill described diversity of education as being of “unspeakable importance,” warning that state education would result in a “despotism over the mind” that would enforce a uniformity suiting various elites but serving society poorly:

A general State education is a mere contrivance for moulding people to be exactly like one another; and as the mould in which it casts them is that which pleases the dominant power in the government, whether this be a monarch, an aristocracy, or a majority of the existing generation; in proportion as it is efficient and successful, it establishes a despotism over the mind, leading by a natural tendency to one over the body.

Mill continued: “An education established and controlled by the State should only exist, if it exist at all, as one among many competing experiments, carried on for the purpose of example and stimulus, to keep the others up to a certain standard of excellence.”

As both John Stuart Mill and later Milton Friedman argued, civil society should play the leading role in providing education, with government restricting its role to finance. Diversity, including the preference of many parents for religious schools, should operate under a broad tolerance of the wide range of societal preferences.

Now contrast this vision of a vibrant and tolerant education system with:

Do fundamentalists want their kids to learn a bunch of hillbilly science? Handle venomous snakes? Learn that God looks down on Catholics, that America would still have slavery except “some power-hungry individuals stirred up the people”? Knock yourself out. Just don’t expect anyone else to pay for it, and stop calling it “education.” It’s not. It’s more like a 12-year sentence to some anamorphic Sunday school class from hell with no time off for good behavior.

I turned the Periodic Table into a catchy tune…care to hear it?

So…if there is even a poorly supported notion that some schools differ from what JS Mill called “the mould,” your reward will be a poorly reasoned and poorly supported denunciation. Note, however, that Florida has been an enthusiastic adopter of education opportunity and mould-breaking education in the form of the nation’s largest tax-credit, ESA, voucher, and digital learning systems, and a liberal charter law to boot. Has this provided the sort of “example and stimulus” posited by Mill? Multiple empirical studies have found improved performance in Florida district schools based on higher levels of exposure to choice options, and it looks very difficult to argue for broad aggregate harm:

We have compelling reasons not to hand Lauren Ritchie or anyone else the education mould. This sort of bigotry was unworthy of American ideals when Protestants directed it at Catholics during the Know-Nothing Era, and it is equally vile when directed at Protestants or <fill in the blank here> today, the dark sarcasm in the paper notwithstanding. Hey, Ritchie, leave those kids alone.

Two Minute Hate-Florida Science Edition in Hillbilly 3-D Satanovision!!!!!!!!!!

June 8, 2018

(Guest Post by Matthew Ladner)

Orlando Sentinel columnist Lauren Ritchie displays double-plus good duck-speaking two-minute hate in what has to be one of the most impressive displays of over-confident bias I’ve seen in many, many moons. In fact, this column is nominated for the Two-Minute Hate Hall of Fame! Some of Florida’s media have grown suspicious of science instruction in Florida’s private choice program. Oh dear. Here are a couple of excerpts:

Some of these schools — 80 percent describe themselves as “Christian” — use textbooks that claim people lived with dinosaurs. Heck, Noah had a couple in the ark. Some say God saved North America from Catholics and gave them South America instead. Others teach that slaves who “knew Christ” had “more freedom” than nonbelievers who weren’t captive. Babble. Just sheer babble.

The only reason these fringe “Christian schools” are getting away with sucking up millions in education funding is that Florida legislators are afraid of offending them. Elected types are so terrified of the instant howling about “Christians” being “persecuted” that they never seriously considered demanding the course of study in voucher schools meet the same standards taught in public school. They’re just happy to buy votes with millions in cash. Your tax dollars.

Folks, these are neither real schools nor, scholars will argue, are they Christian. They’re just money-making little engines for benighted fraudsters whose only other chance at a paying job is the Sears hardware department.

Do fundamentalists want their kids to learn a bunch of hillbilly science? Handle venomous snakes? Learn that God looks down on Catholics, that America would still have slavery except “some power-hungry individuals stirred up the people”? Knock yourself out. Just don’t expect anyone else to pay for it, and stop calling it “education.” It’s not. It’s more like a 12-year sentence to some anamorphic Sunday school class from hell with no time off for good behavior.

So, there is a problem with all of this. Despite a fog of name-calling (hillbilly from hell!) and questioning of motives (profiteering hillbillies from hell!) Ritchie does not present a whit of evidence that private school students learn less science than public school students.

Fortunately, the National Center for Education Statistics runs a project known as the National Assessment of Educational Progress (NAEP) which can shed light on this subject. Also known as the “Nation’s Report Card” this source has a huge amount of highly respected academic data, including some on Science, in private schools, among parents without college degrees.

NAEP does not provide private school science scores by individual state, but it does by region. We will therefore focus on the South, and the private school scores of students whose parents did not graduate from college. If you are going to find flat-earther profiteering red-neck science with Dinosaurs on Noah’s Ark, or whatever it is Ritchie imagines, this ought to be the place to look.

NAEP tracks four types of Science achievement: Earth Science, Physical Science, Life Science and Overall Science. The 2009 8th grade science exam allowed us to look exactly at what Ritchie denounces: students whose parents did not graduate from college, who attend private schools located in Southern states. How did these kids perform in the various NAEP Science categories compared to their public school peers?


You see the same pattern, by the way, if you compare public and private school scores in the South for students whose parents did graduate from college. Or if you do these comparisons among Western, Northeastern, or Midwestern students. Across four measures of science and four regions of the country, and separately for non-college and college educated parents (32 comparisons in all), private school students demonstrated a higher score a mere 32 times out of 32. Most relevant to our present discussion, however, are the four comparisons in the above chart, among southern students without college educated parents.
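For anyone wanting to check where the count of 32 comes from, it is just every combination of science measure, region, and parental education group. A minimal sketch, with list names chosen for illustration:

```python
from itertools import product

measures = ["Earth Science", "Physical Science", "Life Science", "Overall Science"]
regions = ["South", "West", "Northeast", "Midwest"]
parent_groups = ["did not graduate from college", "graduated from college"]

# Every (measure, region, parent group) combination is one public vs. private comparison.
comparisons = list(product(measures, regions, parent_groups))
print(len(comparisons))  # 4 * 4 * 2
```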

Since the author brought up Protestant schools (snake-handling!!!!) it is worth noting that only 15% of Florida tax credit students attend Catholic schools. Unlike in other regions of the country, the denominational distribution in American southern states does not lend itself to private school sectors dominated by Catholic schools. It’s also worth noting that Florida had (by far) the nation’s largest tax credit program and the nation’s largest voucher program when these data were generated (2009), so we can safely assume that Florida kids are over-represented in the Southern sample. Back in 2009, southern private choice programs outside of Florida were few in number and small in size.

In recent years, changes in the federal free and reduced lunch program have made it increasingly unreliable as a proxy for family income. This is all the worse among private schools, which often have difficulty accessing the program. We will therefore make use of parental education as an imperfect proxy for family income. Nevertheless, when hunting for hillbilly kids with snake-handling parents suckered by profiteering private school swindlers being taught that the Sun rotates around the Earth in science class, you would be hard-pressed to find a better place to look.

NAEP has private school scores for 8th grade Science from 2011, 2009, 2000 and 1996. The 2011 figures provide no regional numbers for private school students whose parents did not graduate from college, but do provide numbers for students of college graduate parents in both public and private schools. Private school students had higher scores in three of four regions (one point advantage for public schools in the Northeast). The 2000 and 1996 data do not include a regional variable, but in the national comparisons private school students had higher scores than their public school peers in both 2000 and 1996, both among students whose parents did not graduate from college and among students whose parents did. So I think this basically puts the score at 39 to 1.
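One way to arrive at that running score is sketched below. The 2009 and 2011 counts come straight from the text; treating the 2000 and 1996 national results as four comparisons (two parent groups in each of two years) is an assumption about how the tally was made.

```python
# (private school higher, public school higher) for each batch of comparisons.
tally = {
    "2009: 4 measures x 4 regions x 2 parent groups": (32, 0),
    "2011: college graduate parents, 4 regions": (3, 1),
    "2000 and 1996: national, 2 parent groups x 2 years": (4, 0),  # ASSUMED breakdown
}

private_wins = sum(private for private, _ in tally.values())
public_wins = sum(public for _, public in tally.values())
print(f"{private_wins} to {public_wins}")
```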

Comparisons like the one above do not prove that private schools are better at teaching science than public schools in any way, shape or form. They do however cast substantial doubt on the notion that they do a hugely worse job at teaching science than public schools. If private schools are running profiteering-hillbilly-from-hell science classes, what pray-tell is going on in the public schools?

Texas Charter Schools Enroll 85% Minority Students and they CRUSHED the 2017 NAEP

June 4, 2018

(Guest Post by Matthew Ladner)

The Nation’s Report Card (aka the National Assessment of Educational Progress or NAEP) released new student achievement data in April for Math and Reading exams given in 2017. Nationwide, the news was not good, and the same was generally true for scores in Texas. Figure 1 shows NAEP math and reading gains for 8th grade students since 2009 for states.

In 2017, Texas 8th graders scored three points higher in math than Texas 8th graders did in 2009, but five points lower in reading. On these tests, 10 points is approximately equal to a grade level worth of average academic progress. Overall, Texas students failed to show much academic improvement during this period.
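To put those point changes in more familiar terms, the rule of thumb above can be applied directly. This is a rough sketch only: the 10-points-per-grade conversion is an approximation, and the function and variable names are illustrative.

```python
POINTS_PER_GRADE_LEVEL = 10  # approximate rule of thumb stated above

def grade_levels(point_change):
    """Convert a NAEP scale-point change into approximate grade levels."""
    return point_change / POINTS_PER_GRADE_LEVEL

# Texas 8th grade changes from 2009 to 2017, as given in the text.
texas_changes = {"math": 3, "reading": -5}
for subject, change in texas_changes.items():
    print(f"{subject}: {grade_levels(change):+.1f} grade levels")
```

In other words, roughly a third of a grade level gained in math and about half a grade level lost in reading.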

While Texas districts have floundered, Texas charter schools have flourished academically. Figure 2 presents the same information from states but includes the progress for Texas charter school students along with the statewide averages. At 315,200 students during the 2016-17 school year, Texas charter schools serve more students than 13 state public education systems.

The improvement in scores among Texas charter school students greatly exceeds that of any state. Gains, however, are not the only consideration. Some states like Massachusetts failed to show large academic gains but had high scores in both 2009 and 2017. International comparisons show Massachusetts compares favorably to the top European and Asian school systems, so there is no shame in holding your mud with high scores. The next chart therefore plots 8th grade math gains (from 2009 to 2017) with overall scores (in 2017) for states and Texas charter students to take both improvement over time and overall level of achievement into account:

Texas charter students not only had higher gains than any state, they also demonstrated higher overall scores than most states. Each NAEP exam utilizes a new random sample of students, and the “sampling error” for subgroups exceeds that for states. Such sampling error however should be randomly distributed, meaning that Texas charter scores/gains could be either larger or smaller. The Reading exam however provides an entirely separate sample of students from the math exam, presented in Figure 4 below:

Texas charter students again show both larger gains than any state and relatively impressive scores, especially when considering student demographics, which leads us to our next subject.

Scores and gains this impressive naturally lead to the question of whether changes in student composition drive them. Only a random assignment study can definitively isolate the role of school quality in driving scores and gains, and such studies are impractical for statewide assessments. A Stanford University study utilizing state academic data from 2011 to 2015 using a student matching strategy found evidence of stronger academic growth for Hispanic students attending Texas charters after controlling for a variety of factors.

The demographic distribution of Texas charter schools stood at more than 85% minority in 2016-17 after having become increasingly majority-minority over the previous decade. Hispanics constituted 60 percent of Texas charter students in 2016-17. A comparison of Hispanic 8th grade math scores that accounted for both parental education and special program status (English Language Learners and Special Education) found that Hispanic students attending Texas charter schools outscored all statewide averages for Hispanic students.

This would have to be the case in order to make this happen:

NAEP can’t isolate the role of average school quality in these impressive scores. When, however, an 85% minority school sector outscores Vermont, I’m cautiously optimistic that school quality had something to do with it. Texas adds ~90,000 new K-12 students per year and the districts seem to be struggling both academically and financially in keeping up with the growth. Expanded opportunities for families to choose the sort of education to suit their needs and aspirations could help address both concerns.
