Understanding the Gates Foundation’s Measuring Effective Teachers Project

January 9, 2013

If I were running a school I’d probably want to evaluate teachers using a mixture of student test score gains, classroom observations, and feedback from parents, students, and other staff.  But I recognize that different schools have different missions and styles that can best be assessed using different methods.  I wouldn’t want to impose on all schools in a state or the nation a single, mechanistic system for evaluating teachers since that is likely to be a one size fits none solution.  There is no single best way to evaluate teachers, just like there is no single best way to educate students.

But the folks at the Gates Foundation, afflicted with PLDD, don’t see things this way.  They’ve been working with politicians in Illinois, Los Angeles, and elsewhere to centrally impose teacher evaluation systems, but they’ve encountered stiff resistance.  In particular, they’ve noticed that teachers and others have expressed strong reservations about any evaluation system that relies too heavily on student test scores.

So the folks at Gates have been trying to scientifically validate a teacher evaluation system that involves a mix of test score gains, classroom observations, and student surveys so that they can overcome resistance to centrally imposed, mechanistic evaluation systems.  If they can reduce reliance on test scores in that system while still carrying the endorsement of “science,” the Gates folk imagine  that politicians, educators, and others will all embrace the Gates central planning fantasy.

Let’s leave aside for the moment the political reality, demonstrated recently in Chicago and Los Angeles, that teachers are likely to fiercely resist any centrally imposed, mechanistic evaluation system regardless of the extent to which it relies on test scores.  The Gates folks want to put on their lab coats and throw the authority of science behind a particular approach to teacher evaluation.  If you oppose it you might as well deny global warming.  Science has spoken.

So it is no accident that the release of the third and final round of reports from the Gates Foundation’s Measuring Effective Teachers project was greeted with the following headline in the Washington Post: “Gates Foundation study: We’ve figured out what makes a good teacher,”  or this similarly humble claim in the Denver Post: “Denver schools, Gates foundation identify what makes effective teacher.”  This is the reaction that the Gates Foundation was going for — we’ve used science to discover the correct formula for evaluating teachers.  And by implication, we now know how to train and improve teachers by using the scientifically validated methods of teaching.

The only problem is that things didn’t work out as the Gates folks had planned.  Classroom observations make virtually no independent contribution to the predictive power of a teacher evaluation system.  You have to dig to find this, but it’s right there in Table 1 on page 10 of one of the technical reports released yesterday.  In a regression to predict student test score gains using out of sample test score gains for the same teacher, student survey results, and classroom observations, there is virtually no relationship between test score gains and either classroom observations or student survey results.  In only 3 of the 8 models presented is there any statistically significant relationship between either classroom observations or student surveys and test score gains (I’m excluding the 2 instances where they report p < .1 as statistically significant).  And in all 8 models the point estimates suggest that a standard deviation improvement in classroom observation or student survey results is associated with less than a .1 standard deviation increase in test score gains.

Not surprisingly, a composite teacher evaluation measure that mixes classroom observations and student survey results with test score gains is generally no better and sometimes much worse at predicting out of sample test score gains.  The Gates folks trumpet the finding that the combined measures are more “reliable” but that only means that they are less variable, not any more predictive.

But “the best mix” according to the “policy and practitioner brief” is “a composite with weights between 33 percent and 50 percent assigned to state test scores.”  How do they know this is the “best mix?”  It generally isn’t any better at predicting test score gains.  And to collect the classroom observations involves an enormous expense and hassle.  To get the measure as “reliable” as they did without sacrificing too much predictive power, the Gates team had to observe each teacher at least four different times by at least two different coders, including one coder outside of the school.  To observe 3.2 million public school teachers for four hours by staff compensated at $40 per hour would cost more than $500 million each year.  The Gates people also had to train the observers at least 17 hours and even after that had to throw out almost a quarter of those observers as unreliable.  To do all of this might cost about $1 billion each year.
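The $500 million figure above is simple arithmetic, and a short sketch makes it easy to check.  It uses only the numbers quoted in the paragraph; the variable and function names are mine, and it covers observer time only, since the text gives no unit costs for training or for the observers who were thrown out.

```python
# Figures quoted in the text above; names are illustrative, not from the MET reports.
TEACHERS = 3_200_000       # public school teachers
HOURS_PER_TEACHER = 4      # observation hours per teacher per year
HOURLY_RATE = 40           # dollars per observer hour


def observation_cost(teachers, hours, rate):
    """Annual cost in dollars of observer time alone."""
    return teachers * hours * rate


cost = observation_cost(TEACHERS, HOURS_PER_TEACHER, HOURLY_RATE)
print(f"${cost / 1e6:.0f} million per year")  # $512 million per year
```

The roughly $1 billion total is my estimate once training and discarded observers are added; it is not derived from this calculation.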

And what would we get for this billion?  Well, we might get more consistent teacher evaluation scores, but we’d get basically no improvement in the identification of effective teachers.  And that’s the “best mix?”  Best for what?  It’s best for the political packaging of a centrally imposed, mechanistic teacher evaluation system, which is what this is all really about.  Vicki Phillips, who heads the Gates education efforts, captured in this comment what I think they are really going for with a composite evaluation score:

Combining all three measures into a properly weighted index, however, produced a result “teachers can trust,” said Vicki Phillips, a director in the education program at the Gates Foundation.

It’ll cost a fortune, it doesn’t improve the identification of effective teachers, but we need to do it to overcome resistance from teachers and others.  Not only will this not work, but in spinning the research as they have, the Gates Foundation is clearly distorting the straightforward interpretation of their findings: a mechanistic system of classroom observation provides virtually nothing for its enormous cost and hassle.  Oh, and this is the case when no stakes were attached to the classroom observations.  Once we attach all of this to pay or continued employment, their classroom observation system will only get worse.

I should add that if classroom observations aren’t useful as predictors, they also can’t be used effectively for diagnostic purposes.  An earlier promise of this project was that they would figure out which teacher evaluation rubrics were best and which sub-components of those rubrics were most predictive of effective teaching.  But that clearly hasn’t panned out.  In the new reports I can’t find anything about the diagnostic potential of classroom observations, which is not surprising since those observations are not predictive.

So, rather than having “figured out what makes a good teacher” the Gates Foundation has learned very little in this project about effective teaching practices.  The project was an expensive flop.  Let’s not compound the error by adopting this expensive flop as the basis for centrally imposed, mechanistic teacher evaluation systems nationwide.

(Edited for typos and to add links.  To see a follow-up post, click here.)


How the Gates Foundation Spins its Research

January 7, 2012

The Gates Foundation has released the next installment of reports in their Measuring Effective Teachers Project.  When the last report was released, I found myself in a tussle with the Gates folks and Sam Dillon at the New York Times because I noted that the study’s results didn’t actually support the finding attributed to it.  Vicki Phillips, the education chief at Gates,  told the NYT and LA Times that the study showed that “drill and kill” and “teaching to the test” hurt student achievement when the study actually found no such thing.

With the latest round of reports, the Gates folks are back to their old game of spinning their results to push policy recommendations that are actually unsupported by the data.  The main message emphasized in the new round of reports is that we need multiple measures of teacher effectiveness, not just value-added measures derived from student test scores, to make reliable and valid predictions about how effective different teachers are at improving student learning.

This is the clear thrust of the newly released Policy and Practice Brief  and Research Paper and is obviously what the reporters are being told by the Gates media people.  For example, Education Week summarizes the report as follows:

…the study indicates that the gauges that appear to make the most finely grained distinctions of teacher performance are those that incorporate many different types of information, not those that are exclusively based on test scores.

And Ed Sector says:

The findings demonstrate the importance of multiple measures of teacher evaluation: combining observation scores, student achievement gains, and student feedback provided the most reliable and predictive assessment of a teacher’s effectiveness.

But buried away on p. 51 of the Research Paper in Table 16 we see that value-added measures based on student test results — by themselves — are essentially as good or better than the much more expensive and cumbersome method of combining them with student surveys and classroom observations when it comes to predicting the effectiveness of teachers.  That is, the new Gates study actually finds that multiple measures are largely a waste of time and money when it comes to predicting the effectiveness of teachers at raising student scores in math and reading.

According to Table 16, student achievement gains correlate with the underlying value-added by teachers at .69. If the test scores are combined (with an equal weighting) with the results of a student survey and classroom observations that rate teachers according to a variety of commonly-used methods, the correlation to underlying value-added drops to be between .57 and .61.  That is, combining test scores with other measures where all measures are equally weighted actually reduces reliability.

The researchers also present the results of a criteria-weighted combination of student achievement gains, student surveys, and classroom observations, with weights based on the regression coefficients showing how predictive each is of student learning growth in other sections for the same teacher.  Based on this the test score gains are weighted at .729, the student survey at .179, and the classroom observations at .092.  This tells us how much more predictive test score gains are than student surveys or classroom observations.  Yet even when test score gains constitute 72.9% of the combined measure, the correlation to underlying teacher quality still ranges between .66 and .72, depending on which method is used for rating the classroom observations.  The criteria-weighted combined measure provides basically no improvement in reliability over using test score gains by themselves.
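To make the two weighting schemes concrete, here is a minimal sketch of how a composite score would be built under each.  The weights are the ones quoted above; the function, the dictionary keys, and the example teacher's scores are hypothetical and are not taken from the MET data.

```python
# Criteria weights as quoted in the text, versus equal weighting of the same components.
CRITERIA_WEIGHTS = {"test_gains": 0.729, "student_survey": 0.179, "observation": 0.092}
EQUAL_WEIGHTS = {name: 1 / 3 for name in CRITERIA_WEIGHTS}


def composite(scores, weights):
    """Weighted sum of a teacher's standardized component scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical teacher, scores in standard-deviation units (made up for illustration):
# strong on test score gains, average on the survey, weak on observations.
example = {"test_gains": 1.0, "student_survey": 0.0, "observation": -1.0}

criteria_score = composite(example, CRITERIA_WEIGHTS)
equal_score = composite(example, EQUAL_WEIGHTS)
print(round(criteria_score, 3), round(equal_score, 3))
```

For this made-up teacher the criteria-weighted composite comes out at about 0.64 while the equal-weighted one is zero, which shows how much equal weighting lets the weaker predictors dilute the test score signal.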

And using multiple measures does not improve our ability to distinguish between effective and ineffective teachers.  Using test scores alone the difference between the top quartile and bottom quartile teacher in producing  student value-added is .24 standard deviations in math learning growth on the state test.  If we combine test scores with student surveys and classroom observations using an equal weighting, the difference between top and bottom quartile teachers shrinks to be between .19 and .21.  If we use the criteria weights, where test scores are 72.9% of the combined measure, the gap between top and bottom teacher ranges between .22 and .25.  In short, using multiple measures does not improve our ability to distinguish between effective and ineffective teachers.
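The comparison is easier to see side by side.  The sketch below takes the quartile gaps quoted in the paragraph above and expresses each composite's range relative to test scores alone; the labels and function name are my own.

```python
# Top-vs-bottom quartile gaps in math value-added (standard deviations), as quoted above.
GAPS = {
    "test scores alone": (0.24, 0.24),
    "equal-weighted composite": (0.19, 0.21),
    "criteria-weighted composite": (0.22, 0.25),
}
BASELINE = GAPS["test scores alone"][0]


def change_vs_baseline(low, high, baseline=BASELINE):
    """Range of the quartile gap relative to test scores alone, in SD units."""
    return (round(low - baseline, 2), round(high - baseline, 2))


for name, (low, high) in GAPS.items():
    print(name, change_vs_baseline(low, high))
```

The equal-weighted composite is worse across its whole range, and the criteria-weighted one lands within a couple of hundredths of a standard deviation of test scores alone in either direction.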

The same basic pattern of results holds true for reading, which can be seen in Table 20 on p. 55 of the report.  Combining test score measures of teacher effectiveness with student surveys and classroom observations does improve a little our ability to predict how students would answer survey items about their effort in schools as well as how they felt about their classroom environment.  But unlike test scores, which have been shown to be strong predictors of later life outcomes, I have no idea whether these survey items accurately capture what they intend or have any importance for students’ lives.

Adding the student surveys and classroom observation measures to test scores yields almost no benefits, but it adds an enormous amount of cost and effort to a system for measuring teacher effectiveness.  To get the classroom observations to be usable, the Gates researchers had to have four independent observations of those classrooms by four separate people.  If put into practice in schools that would consume an enormous amount of time and money.  In addition, administering, scoring, and combining the student survey also has real costs.

So, why are the Gates folks saying that their research shows the benefits of multiple measures of teacher effectiveness when their research actually suggests virtually no benefits to combining other measures with test scores and when there are significant costs to adding those other measures?  The simple answer is politics.  Large numbers of educators and a segment of the population find relying solely on test scores for measuring teacher effectiveness to be unpalatable, but they might tolerate a system that combined test scores with classroom observations and other measures.  Rather than using their research to explain that these common preferences for multiple measures are inconsistent with the evidence, the Gates folks want to appease this constituency so that they can put a formal system of systematically measuring teacher effectiveness in place.  The research is being spun to serve a policy agenda.

This spinning of the findings is not just an accident or the result of a misunderstanding.  It is clearly deliberate.  Throughout the two reports Gates just released, they regularly engage in the same pattern of presenting the information.  They show that the classroom observation measures by themselves have weak reliability and validity in predicting effective teachers.  But if you add the student survey and then add the test score measures, you get much better measures of effective teachers.  This pattern of presentation suggests the importance of multiple measures, since the classroom observations are strengthened when other measures are added.  The only place you find the reliability and validity of test scores by themselves is at the bottom of the Research Paper in Tables 16 and 20.  If both the lay-version and technical reports had always shown how little test scores are improved by adding student surveys and classroom observations, it would be plain that test scores alone are just about as good as multiple measures.

The Gates folks never actually inaccurately describe their results (as Vicki Phillips did with the previous report).  But they are careful to frame the findings as consistently as possible with the Gates policy agenda of pushing a formal system of measuring teacher effectiveness that involves multiple measures.  And it worked, since the reporters are repeating this inaccurate spin of their findings.


(UPDATE — For a post anticipating responses from Gates, see here.)

Gates Foundation — Release the MET Results

October 25, 2011

A sketch of the $500 million new Gates Foundation headquarters

Bill and Melinda Gates mentioned again in the Wall Street Journal the Measuring Effective Teachers (MET) project that their foundation is orchestrating.  Bill and Melinda may want to check on the status of the MET research they’ve been touting since full results were promised in the spring of 2011 and have yet to be released.

Just to review… In an earlier interview with the Journal, MET was described as follows:

the Gates Foundation’s five-year, $335-million project examines whether aspects of effective teaching, classroom management, clear objectives, diagnosing and correcting common student errors can be systematically measured. The effort involves collecting and studying videos of more than 13,000 lessons taught by 3,000 elementary school teachers in seven urban school districts.

The motivation, re-iterated in the new piece by Bill and Melinda Gates, is to identify what “works” in classroom teaching in order to develop systems that train and encourage other teachers to imitate those practices:

It may surprise you—it was certainly surprising to us—but the field of education doesn’t know very much at all about effective teaching. We have all known terrific teachers. You watch them at work for 10 minutes and you can tell how thoroughly they’ve mastered the craft. But nobody has been able to identify what, precisely, makes them so outstanding….

The intermediate goal of MET is to discover what we are able to measure that is predictive of student success. The end goal is to have a better sense of what makes teaching work so that school districts can start to hire, train and promote based on meaningful standards.

As I’ve argued before, using research to identify “best practices” in teaching only makes sense if the same teaching approaches would be desirable for the vast majority of teachers and students, regardless of the context.  And as I’ve also  suggested before, I don’t believe this effort is likely to yield much in education.  Effective teaching is like effective parenting — it is highly dependent on the circumstances.  Yes, there are some parenting (and teaching) techniques that are generally effective for almost everyone, but those are mostly known and already in use.

This doesn’t mean we are completely unable to measure effective teaching (or parenting).  It just means that we have to judge it by the results and cannot easily make universal statements about the right methods for producing those results.  To make a sports analogy, there is no single “best practice” for hitters in baseball.  There are a variety of stances and swings.  The best way to judge an effective hitter is by the results, not by the stance or swing.  And if we tried to make all hitters stand and swing in the same way, we’d make a lot of them worse hitters.

It is because of this heterogeneity in effective teaching practices that I think the MET project is doomed to disappoint.  And I’ve heard from inside sources that results are being delayed because they are failing to produce much of anything.

According to the MET web site, the full results for the 1st year should have been released in the spring:

 In spring 2011, the project will release full results from the first year of the study, including predictors of teaching effectiveness and correlation with value-added assessments.

It is almost November and we have not seen these results.  I understand that in very large and complicated projects, like MET, things can take much longer than originally planned.  If so, it would be nice to hear that explanation.  It would be even nicer if the Gates Foundation released results if they have them, even if those results were not what they had hoped they would find.

Some inquisitive reporters should start asking Gates officials and members of the research team about the status of the MET results.  Reporters should go beyond talking to the media flacks at Gates HQ and actually talk to individual members of the team confidentially.  If they do that, they may confirm what I have been hearing: MET results have been delayed because they aren’t panning out.

(UPDATE:  Gates responds.)

The Gates Foundation and the Rise of the Cool Kids

July 28, 2011

(Guest Post by Matthew Ladner)

Jay and Greg have been carrying on an important discussion concerning the Gates Foundation and education reform. I wanted to add a few thoughts.

Rick Hess and others have noted the “philanthropist as royalty” phenomenon in the past. Any philanthropist runs the danger of only hearing what they want to hear from their supplicants, and Gates as the largest private foundation runs the biggest risk. The criticism of the Gates Foundation I had seen in the past emanated from the K-12 reactionary fever swamp, hardly qualifying as constructive.

The challenge faced by philanthropists: how do you challenge your own assumptions and evaluate your own efforts honestly? Do you hire formidable Devil’s advocates to level their most skeptical case against your efforts?

I don’t know the answer to these questions, just that if I were Bill Gates I would be terrified of everyone telling me how right my thinking is because they want my money. This is however the best sort of problem to have…

Jay’s central critique of the Gates Foundation strategy seems to be that they have put too much faith in a centralized command and control strategy. They would be wise to entertain this thought. If command and control alone were the solution, then we wouldn’t have education problems; district, state and federal governance have all failed to prevent widespread academic failure for decades.

The Gates strategy does however embrace decentralization. Over the years they have supported charter schools, and fiercely opposed the worst one-size-fits-all policy of all: salary schedules and automatic/irrevocable tenure. Riley’s WSJ article makes clear that Gates understands the benefits of private school choice, but that he falls for the Jay Mathews fallacy of thinking it is just too politically difficult.

Sigh…perhaps next year Greg can make a dinner bet with Bill.

Gates is also the primary backer of Khan Academy. This new article on Sal Khan in Wired magazine makes clear that Khan understands the danger of being swallowed by school systems and that he is not going to allow it to happen. Khan Academy is both radically decentralized and is in the early stages of being used by people within the centralized school system to improve outcomes.

Whatever the mistakes to date, the Gates Foundation has in my mind succeeded in serving as a counter-weight to the NEA, mostly through funding the efforts of a myriad network of reform organizations collectively known as the Cool Kids. Today, there is a struggle for power going on within the Democratic Party over K-12 policy and the Gates Foundation deserves some credit in my mind for supporting the ideas behind the “Democrat Spring” on education policy. This spring is following more of the Syrian than the Egyptian model thus far, but it is happening, and it is very important.

Does that mean that they are the “good guys” and Jay should lay off of them? Of course not; reasoned critiques of large philanthropists are in short supply for all of the reasons cited above. Jason Riley wished that Gates were bolder in embracing decentralization reforms, but noted that in the end it was the Gates rather than the Riley Foundation. This is absolutely true, but it doesn’t make the royalty problem go away, and leaves a continuous question of how the emperor gets feedback on his new clothes.

I don’t agree with the Cool Kids about everything. The next time I hear someone ask a question about having Common Core replace NAEP (the very pinnacle of naive folly) for instance I may pull out entire tufts of my graying, thinning hair in utter exasperation. Reformers of all stripes need to be on guard against the ship-wheel conceit, which is to imagine that if only my strong hands steered the ship, we’d sail through the rocky shoals of ed reform without a hitch.

The East Germans ran a much better economy than the North Koreans, much to the benefit of Germans and to the detriment of Koreans. This is real and important in human terms; I do not make this point glibly. I never heard about an East German famine decimating the population, but food shortages have even soldiers starving to death in North Korea (pity the women and children). Better quality management is good and desirable, but…it will only take you so far. Today, Chinese apparatchiks are noisily crediting themselves for the tremendous economic progress in China without the slightest hint of irony. Without the market forces Deng introduced and with more apparatchiks, China would revert to a starving backwater. With fewer apparatchiks, her progress would almost certainly accelerate.

As Sara Mead correctly noted in this guest post at Eduwonk, today’s education debate largely involves a mixture of technocratic and market-based reforms (neo-liberals) on one side and a group of reactionaries lacking realistic solutions on the other. A third of our 4th graders can’t read and have been shoved into the dropout pipeline. We need both technocratic and market based reforms, and we need stronger reforms of both sorts than those fielded to date.

Jay’s critique concerns the right mix of reforms within the bounds of the neo-liberal consensus. This of course is a matter of debate, and debate is the path to deeper understanding. The sheer size of the Gates Foundation has the potential to stifle such debate as it relates to their efforts, even passively, and reformers should recognize the danger in allowing it to do so. This isn’t about them so much as it is about us.

Gates Foundation Follies (Part 2)

July 26, 2011

A sketch of the $500 million new Gates Foundation headquarters

In Part 1 of this post, I described how the Gates Foundation came to recognize the importance of using political influence to reform the education system rather than focusing on reforming one school at a time in the hopes that school systems would see and replicate successful models.  No private philanthropist has enough money to buy and sustain widespread adoption of an effective approach and the public school system has little incentive to identify and spread effective approaches on their own.

Faced with the unwillingness of the public school system to reproduce successful models (assuming that Gates could even offer one), the Foundation was left with two solutions to encourage innovation: 1) identify the best practices themselves and impose them from the top down, or 2) encourage choice and competition so that schools would have the proper incentive to identify, imitate, and properly implement effective approaches.

The Gates Foundation made the wrong choice.  Their top-down strategy cannot work for the following reasons:

1) Education does not lend itself to a single “best” approach, so the Gates effort to use science to discover best practices is unable to yield much productive fruit;

As I’ve explained before, there are many different “best” techniques for different kinds of teachers with different kinds of students in different situations with different available resources.  There are some practices that are universally beneficial in education, but they tend to be pretty obvious and are already well known (e.g. it is bad to beat kids, it is better when teachers know the material they are teaching, it is helpful to break down ideas into their essential components, etc…).

The difficulty of discovering universally beneficial practices that are not already well-known, especially with the blunt tools available to researchers, probably helps explain why the Measuring Effective Teachers (MET) project, on which the Gates Foundation is spending $335 million, has yet to produce any meaningful results despite entering its third year of operation.

2) As a result, the Gates folks have mostly been falsely invoking science to advance practices and policies they prefer for which they have no scientific support;

Despite having nothing to show for the $335 million they are spending on MET, the Gates folks nevertheless claim that it “proves” the harmfulness of teachers engaging in “drill and kill.” The fact that the research showed no such thing did not deter them from telling the NY Times and LA Times that it did.  Even when I pointed out the error, the Gates folks refused to issue a correction (although the LA Times ran one on their own).

Similarly, the Gates-orchestrated effort to push national standards, curricular materials, and assessments is advancing without any scientific evidence of the desirability of these approaches.  Gathering a group of Checker Finn’s friends (er, I mean, “a panel of experts”) to attest that the Common Core standards are better is not science.  It is the false invocation of science to manipulate people into compliance with their agenda.

3) Attempting to impose particular practices on the nation’s education system is generating more political resistance than even the Gates Foundation can overcome, despite their focus on political influence and their devotion of significant resources to that effort;

Opponents of centralized control of education have begun to mobilize against the Gates-orchestrated effort to establish national standards, curricular materials, and assessments.  But the bulk of the political resistance to the Gates strategy will come from the teacher unions.  They don’t want anyone to infringe on their autonomy or place their interests in jeopardy with a nationalized accountability system.  They may play along with Gates for a while and take their money, but when push comes to shove the unions can only tolerate one dictator in education — the unions.  Of course, those of us who don’t want anyone centrally-controlling the nation’s education system will oppose both Gates and the teacher unions.

We already have a taste of the kind of resistance teacher unions will put up against the Gates nationalization effort in the slogans emanating from Diane Ravitch and Valerie Strauss’ Twitter feed, supported by their Army of Angry Teachers.  Falsely claiming that MET proved that drill and kill is harmful did not mollify these folks at all.

The teacher unions derive far more power and money from the status quo than Gates can ever offer them, unless of course Gates builds a nationalized system and cedes control to the unions, which is not part of the Gates plan.  Nothing in the Gates strategy weakens the unions and would force them to make significant concessions, so in the end the unions will either hijack the Gates strategy for their own benefit or block it.  Even Gates does not have the resources to beat the unions without first diminishing their power.

4) The scale of the political effort required by the Gates strategy of imposing “best” practices is forcing Gates to expand its staffing to levels where it is being paralyzed by its own administrative bloat; 

Over the last decade the Gates Foundation has roughly doubled its assets but increased its staffing by about 10-fold.  The Foundation is now huge, which is part of why it needs the Education Pentagon pictured above to house everyone.  The Foundation has gotten huge because it is trying to buy political influence as it buys people.  Gates has been snapping up or funding just about every advocacy group, researcher, or education journalist they can find.  Getting all of these people on board for a nationalized education system (or at least muting their dissent) involves paying an enormous number of people and organizations.

Gates can buy a lot of folks, but they can’t buy everyone and they can’t keep the folks they do pay in line for very long.  It’s like herding cats.  (I should note that I’ve received Gates funding in the past.)

And the sheer size of their staff and funded allies along with the focus on controlling the political message is so overwhelming that it is significantly hindering their ability to do anything.  People inside the organization have told me that they are suffering from a bureaucratic gridlock with endless meetings, conference calls, and chains of approvals.  Notice that Gates is paying a ton of researchers and yet virtually no research is coming out.  Very curious.

5) The false invocation of science as a political tool to advance policies and practices not actually supported by scientific evidence is producing intellectual corruption among the staff and researchers associated with Gates, which will undermine their long-term credibility and influence.

As noted above, the need to advance a particular political message has led Gates to mischaracterize their own research (for example, claiming that MET proves that drill and kill is harmful when the research does not show that).  But the intellectual corruption extends much farther.  I had a highly respected and accomplished researcher employed by Gates tell me that Vicki Phillips’ mischaracterization of the MET results was not so far off because there isn’t a big difference between a low correlation and a negative one.  He also defended comparing the magnitude of a series of pair-wise correlations to determine the relative influence of different variables.  To hear someone who knows better twist the truth to avoid contradicting the education boss at Gates was just sad.

Unfortunately, too many advocates, researchers, and others are being similarly corrupted.  In most cases the Gates folks don’t have to exert any explicit pressure on people to keep them in line; they just anticipate what they think would serve the Gates strategy.  But I am aware of at least one case in which a researcher’s findings were at odds with the desired outcome and that person suffered for it.

I’ve heard another story from someone involved in the MET project that the delay in releasing any results from the analyses of classroom videos even as the project enters its third year is explained by their inability to find any meaningful results.  Perhaps another year of data will make something turn up that they can finally tout for their $335 million investment.  The fact that the initial MET report with basically no useful findings was released on a Friday just before Christmas suggests that the Gates folks are working hard to shape their message.

The national standards, curriculum, and testing campaign is rife with intellectual corruption.  For example, people are twisting themselves into knots to explain how the effort is purely voluntary on the part of states when it is manifestly not, given federal financial “incentives,” offers of selective exemptions to NCLB requirements for states that comply, and the threat of future mandates.  There is so much spin around Gates that it makes one dizzy.


Let me be clear, most of the folks affiliated with Gates are good and smart people.  The problem is that when your reform strategy requires a top-down approach, these good and smart people are put under a lot of stress to have a unified vision of the “best” that will be imposed from the top.  And whenever an organization starts sprinkling millions of dollars on researchers and advocacy groups unaccustomed to that kind of money, there are temptations that are hard for the most virtuous to resist.

But the good and smart people at Gates can stop the counter-productive strategy that the Foundation is pursuing.  The Foundation changed course once before and it can do it again.


UPDATE — For my suggestions of what the Gates Foundation could do instead, see this post.

Gates Foundation Follies (Part 1)

July 25, 2011

A sketch of the $500 million new Gates Foundation headquarters

Jason Riley’s interview with Bill Gates in the Wall Street Journal was not as great as Riley’s interview with me last week (shameless plug for my new mini-book), but it was still very illuminating.  In particular, the Gates interview confirmed two things about the Foundation’s education efforts: 1) they’ve realized that the focus of their efforts has to be on the political control of schools and 2) they are uninterested in using that political influence to advance market forces in education. Instead, the basic strategy of the Gates Foundation is to use science (or, more accurately, the appearance of science) to identify the “best” educational practices and then use political influence to create a system of national standards, curricular materials, and testing to impose those “best practices” on schools nationwide.

The Gates Foundation came to understand the necessity of political influence over schools with the failure of their previous small schools strategy.  Under that strategy they tried to achieve reform by paying school districts to break up larger high schools into smaller ones.  The problem with that strategy is that even the Gates Foundation does not have nearly enough money to buy systemic reform one school at a time.

School districts currently spend over $600 billion per year and the Gates Foundation only has $34 billion in total assets.  With the practice of spending only about 5% of assets each year and given the large (and effective) efforts the Foundation makes in developing country health-care, Gates only spends a couple hundred million dollars on education reform each year. Given the small share of total education spending Gates could offer, most public districts refused to entertain the Gates strategy of smaller schools, others took the money but failed to implement it properly, and others reversed the reform once the Gates subsidies ended.
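A quick back-of-the-envelope check shows the orders of magnitude involved. The $600 billion, $34 billion, and 5% figures come from the paragraph above; the $200 million education figure is only an assumed stand-in for “a couple hundred million,” not a reported number.

```python
# Figures from the text, in billions of dollars.
DISTRICT_SPENDING = 600.0   # annual school district spending
GATES_ASSETS = 34.0         # total Gates Foundation assets
PAYOUT_RATE = 0.05          # roughly 5% of assets spent per year

# ASSUMPTION: "a couple hundred million" is taken as $0.2 billion for illustration.
GATES_EDUCATION = 0.2

# Total yearly payout across all causes, not just education.
annual_payout = GATES_ASSETS * PAYOUT_RATE

# Education giving as a fraction of what districts already spend.
education_share = GATES_EDUCATION / DISTRICT_SPENDING

print(f"Annual payout across all causes: ${annual_payout:.1f} billion")
print(f"Education giving as a share of district spending: {education_share:.3%}")
```

Under that assumption, the Foundation's education giving comes to a few hundredths of one percent of district spending.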

The way I described the situation in my chapter “Buckets into the Sea” in the 2005 book, With the Best of Intentions, edited by Rick Hess is:

Philanthropists simply don’t have enough resource to reshape the education system on their own; all their giving put together amounts to only a tiny fraction of total education spending, so their dollars alone can’t make a significant difference.  In order to make a real difference, philanthropists must support programs that redirect how future public education dollars are spent.

And in 2008 I repeated this claim, saying: “total private giving to public education is a tiny portion of total spending on schools.  All giving, from the bake sale to the Gates Foundation, makes up less than one-third of 1% of total spending.  It’s basically rounding error.”

I don’t know whether the Gates Foundation was influenced by my writing or whether they arrived at the same conclusions independently, but they are now articulating those same conclusions, often with the same exact words:

“It’s worth remembering that $600 billion a year is spent by various government entities on education, and all the philanthropy that’s ever been spent on this space is not going to add up to $10 billion. So it’s truly a rounding error.”

This understanding of just how little influence seemingly large donations can have has led the foundation to rethink its focus in recent years. Instead of trying to buy systemic reform with school-level investments, a new goal is to leverage private money in a way that redirects how public education dollars are spent.

While the focus of the Gates Foundation on influencing education policy is sensible, the particular political approach they have chosen is doomed to fail and attempting it is likely to be counter-productive.  In Part 2 of this post I will explain how the new strategy Gates has decided to pursue is flawed.

To give you a taste of what is coming in Part 2, the arguments can be summarized as: 1) Education does not lend itself to a single “best” approach, so the Gates effort to use science to discover best practices is unable to yield much productive fruit; 2) As a result, the Gates folks have mostly been falsely invoking science to advance practices and policies they prefer for which they have no scientific support; 3) Attempting to impose particular practices on the nation’s education system is generating more political resistance than even the Gates Foundation can overcome, despite their focus on political influence and their devotion of significant resources to that effort; 4) The scale of the political effort required by the Gates strategy of imposing “best” practices is forcing Gates to expand its staffing to levels where it is being paralyzed by its own administrative bloat; and 5) The false invocation of science as a political tool to advance policies and practices not actually supported by scientific evidence is producing intellectual corruption among the staff and researchers associated with Gates, which will undermine their long-term credibility and influence.

Tune in for Part 2.


UPDATE — For my suggestions of what the Gates Foundation could do instead, see this post.

The Gates Effective Teaching Initiative Fails to Improve Student Outcomes

June 21, 2018

Rand has released its evaluation of the Gates Foundation’s Intensive Partnerships for Effective Teaching initiative and the results are disappointing.  As the report summary describes it, “Overall, however, the initiative did not achieve its goals for student achievement or graduation, particularly for LIM [low income minority] students.” But in traditional contract-research-speak this summary really under-states what they found.  You have to slog through the 587 pages of the report and 196 pages of the appendices to find that the results didn’t just fail to achieve goals, but generally were null to negative across a variety of outcomes.

Rand examined the Gates effort to develop new measures of teacher effectiveness and align teacher employment, compensation, and training practices to those measures of effectiveness in three school districts and a handful of charter management organizations.  According to the report, “From 2009 through 2016, total IP [Intensive Partnership] spending (i.e., expenditures that could be directly associated with the components of the IP initiative) across the seven sites was $575 million.”  In addition, Rand estimates that the cost of staff time to conduct the evaluations to measure effectiveness totaled about $73 million in 2014-15, a single year of the program.  Assuming that this staff time cost was the same across the 7 years of the program they examined, the total cost of this initiative exceeded $1 billion.  The Gates Foundation paid $212 million of this cost, with the rest being covered primarily by “site funds,” which I believe means local tax dollars.  The federal government also contributed a significant portion of the funding.
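The $1 billion figure follows from simple arithmetic on the numbers above. A minimal sketch, using the reported dollar amounts and the same simplifying assumption that staff-time cost was constant across all seven years:

```python
# Figures reported in the text, in millions of dollars.
DIRECT_SPENDING = 575       # total IP spending, 2009 through 2016
STAFF_TIME_PER_YEAR = 73    # estimated evaluation staff time in 2014-15

# ASSUMPTION (stated in the text): the 2014-15 staff-time cost held in every year.
YEARS = 7

staff_time_total = STAFF_TIME_PER_YEAR * YEARS
total_cost = DIRECT_SPENDING + staff_time_total

print(f"Staff time over {YEARS} years: ${staff_time_total} million")
print(f"Estimated total cost: ${total_cost} million")
```

If staff time was lower in the early years, when the evaluation systems were still being built, the total would be correspondingly lower; the estimate is only as good as the constant-cost assumption.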

So what did we get for $1 billion?  Not much.  One outcome Rand examined was whether the initiative made schools more likely to hire effective teachers.  The study concluded:

Our analysis found little evidence that new policies related to recruitment, hiring, and new-teacher support led to sites hiring more-effective teachers. Although the site TE [teacher effectiveness] scores of newly hired teachers increased over time in some sites, these changes appear to be a result of inflation in the TE measure rather than improvements in the selection of candidates. We drew this conclusion because we did not observe changes in effectiveness as measured by study-calculated VAM scores, and we observed similar improvements in the site TE scores of more-experienced teachers.

Another outcome was the increased retention of effective teachers:

However, we found little evidence that the policies designed, in whole or in part, to improve the level of retention of effective teachers had the intended effect. The rate of retention of effective teachers did not increase over time as relevant policies were implemented (see the leftmost TE column of Table S.1). A similar analysis based only on measures of value added rather than on the site-calculated effectiveness composite reached the same conclusion (see the leftmost VAM column of Table S.1).

Did the program improve teacher effectiveness overall and specifically access by low income minority students to effective teachers?

…An analysis of the distribution of TE based on our measures of value added found that TE did not consistently improve in mathematics or reading in the three IP districts. There was very small improvement in effectiveness among mathematics teachers in HCPS [Hillsborough County] and SCS [Shelby County] and larger improvement among reading teachers in SCS, but there were also significant declines in  effectiveness among reading teachers in HCPS and PPS [Pittsburgh]. In addition, in HCPS, LIM students’ overall access to effective teaching and LIM students’ school-level access to effective teaching declined in reading and mathematics during the period of the initiative (see Table S.2). In the other districts, LIM students did not have consistently greater access to effective teaching before, during, or after the IP initiative.

And was there an overall change as a result of the program in student achievement and graduation rates?

Our analyses of student test results and graduation rates showed no evidence of widespread positive impact on student outcomes six years after the IP initiative was first funded in 2009–2010. As in previous years, there were few significant impacts across grades and subjects in the IP sites.

Here I think the report is casting a more positive spin on the results than their findings show.  Check out this summary of results from each of the sites:

I see a lot more red (significant and negative effects) than green (significant and positive). The report’s overall conclusion is technically true only because it focuses just on the last year (2014-15) and because it examines each of these 4 sites separately.  A combined analysis across sites and across time, which they don’t provide, would likely show a significant and negative overall effect on test scores.

The attainment effects are also mostly negative.  To find the attainment results at all, you have to dive into a separate appendix file.  There you will see that Pittsburgh experienced a decrease in dropout rates of between 1.3 and 3.5%, depending on the year, which is a positive result.  But Shelby County showed a significant decrease in graduation rates in every year but one.  While dropout, unlike grad rate,  is an annualized measure, the decrease in Shelby County’s graduation rate was as large as 15.7%.  The charter schools also showed a significant decrease in graduation rates as a result of the program in every year but one, with the decline as large as 6.6%.  And Hillsborough experienced a significant increase in dropout rate in one year of about 1.5%.  In three of the four sites examined there were significant, negative effects on attainment. In one site there were positive effects on attainment.

The difference-in-differences analysis that Rand is using is not perfect at isolating causal effects.  And as the report notes, comparison districts were also sometimes implementing reform strategies similar to those of the Partnership sites.  But you would expect that the injection of several hundred million dollars and considerable expert attention would improve implementation in the Partnership districts, so the comparison is still informative.  Besides, the fact that some comparison districts were pursuing some of the same reforms does not explain the splattering of red (negative and significant effects) we see.

As Mike McShane and I note in the book we recently edited on failure in education reform, there is nothing inherently wrong with trying a reform and having it fail.  The key is learning from failure so that we avoid repeating the same mistakes.  It is pretty clear that the Gates effective teaching reform effort failed pretty badly.  It cost a fortune.  It produced significant political turmoil and distracted from other, more promising efforts.  And it appears to have generally done more harm than good with respect to student achievement and attainment outcomes.

The Rand report draws at least one appropriate lesson from this experience:

A favorite saying in the educational measurement community is that one does not fatten a hog by weighing it. The IP initiative might have failed to achieve its goals because the sites were better at implementing measures of effectiveness than at using them to improve student outcomes. Contrary to the developers’ expectations, and for a variety of reasons described in the report, the sites were not able to use the information to improve the effectiveness of their existing teachers through individualized PD, CLs, or coaching and mentoring.


Review of Letters to a Young Education Reformer

April 23, 2017

Below is an edited version of a review of Rick Hess’ new book that John Thompson, educator and frequent internet commentator, sent to me.  While I’m sure that John and I do not see eye to eye on all things (I think I’m much shorter), I find his perspective valuable and there is much in this review that I find useful.  The original came in at over 5,000 words spread across multiple posts, but with his permission I have edited it down to about 2,000 words in a single post. Enjoy.

(Guest Post by  John Thompson)

I’m not sure that I completely believe him, but Rick Hess concludes his Letters to a Young Education Reformer by saying he’s not a nice guy. He chides the last generation of school reformers not for the not-nice things they’ve done, but for ignoring too many key tenets of professionalism. He also shares some valuable thoughts with “‘far-from-young,’ reformers,” and veteran teachers like me who still have a hard time grasping how and why the accountability-driven, competition-driven social engineering experiment was imposed on our nation’s schools.
Hess describes himself as a “little-r reformer,” as opposed to a “Big-R reformer.” Little-r reformers believe that schools can do a far better job, and that schooling must be reimagined. They are less confident than Big-R Reformers that they know the answers. Hess seeks a “big-tent” approach to education, and a small d-democratic vision for public education. Big-R Reform, however, “has congealed into a set of prescriptions, it has grown more bureaucratic and self-assured, and further and further removed from the intuitions of little-r reform.”
Similarly, Big-P Philanthropy has enabled the hubris of Big-R Reform, and furthered the move towards the micromanaging of diverse schools across the nation. When Big-R Reform, Big-Philanthropy, and an activist federal Department of Education join together in an effort to social engineer public education, dissent can be quashed. I would add that when the clash of ideas is driven out of schools, the way that it often has been during the last 15 years, democracy is undermined.
Hess explains to young reformers why they should learn to control their passion. Thinking that they are uniquely on the side of angels, reformers pushed the soundbite, “This is about kids, not adults!” In doing so, novice reformers remained oblivious to another of Hess’ truisms – “implementation matters.” Real world, Hess explains, “For better or worse, good schools are the product of thousands of tiny judgments that those educators make every day.” So, by definition, if you want to improve kids’ lives, continually disrespecting teachers is not the way to transform “the status quo.”
Hess does a great job in explaining how and why reformers often display little patience for opposing ideas or for obstacles to their grand theories. First, they were in much too much of a hurry to learn from history’s missteps. Reformers, who often had two or three years of classroom experience – or less – quickly developed an extreme case of “groupthink.” Not knowing what they didn’t know about the history of “silver bullets” that have been hurriedly and repeatedly dumped on our schools, “this or that group of reformers” have demonstrated a clear pattern where they “settle on an agenda and then dismiss doubters as troublemakers.”
As much as it pains me to admit this about a conservative, Hess offers the single most telling anecdote illustrating the irrationalities that groupthink can produce. In late 2002, Hess attended a secret briefing at the Pentagon about the Bush administration’s educational mission in Iraq. It was clear that nobody had much of an idea regarding the situation they would be facing. One issue dominated the meeting, however. Iraq needed its own version of No Child Left Behind!
Hess isn’t a fan of high-stakes testing and he is skeptical of the value-added teacher evaluations that were pushed by the Gates Foundation and the Duncan administration. He credits reformers for ending the “old stupid,” or ignoring data systems, while concluding “the slapdash embrace of half-baked data is ‘the new stupid.’” Hess estimates that test scores “reflect 30 to 35% of what we want schools to do.” The use of those metrics was supposed to move us into the “moneyball,” or the data-informed baseball coaching that was popularized by Michael Lewis. Real world, data moved schools into the pre-moneyball era. But, reformers chose to act nice by talking about teachers as if they are girl scouts. They then used value-added models in ways that teachers were bound to see as a “hatchet job.”
One of the best things about Hess, the Little-r reformer, is that he advises reformers to learn from history’s missteps. He understands that “implementation matters,” but that reformers can have little patience for opposition or obstacles to the experiments that they mandate. As Hess has watched “this or that group of reformers settle on an agenda and then dismiss doubters as troublemakers,” he has been dismayed by the “groupthink” that has grown out of their frustrations.
My favorite Hess statement is that Big-R Reformers “learned the lyrics, not the music.” I’ve repeatedly heard reformers, who had little or no experience in the classroom, complain that the attaching of stakes to test scores did not need to produce teach-to-the-test, basic skills instruction. They demand that “everyone sing from the same hymnal” but deny that any words in the lyrics require drill and kill. Being clueless about the people side of schooling, Big-R Reformers never understood that it was not what they said that matters. What matters is what school systems would hear.
Of course, test-driven accountability, as well as using test scores as the ammunition in the fight between charters and neighborhood schools, forced administrators and teachers to engage in bubble-in malpractice. The big harm came from the rapid scaling up of high-stakes testing directed at individual teachers and students, and charters.  Even in the early days of No Child Left Behind, educators had plenty of options for pretending to comply with mandates while, predictably, shutting their classroom doors and continuing to teach in the same old, good and bad ways.  Reformers responded by doubling down on the punitive, in terms of both the survival of schools and the evaluations of individuals, and by a “growing fascination with PR campaigns and political strategies.”
As with NCLB, the Obama administration imposed quantifiable targets that obviously were impossible to meet. I don’t know when Hess attended the meeting described in Letters to a Young Education Reformer, but he recalls “a no-nonsense veteran” state administrator in Florida who said he could manage about seven turnarounds. The audible shock that he prompted would have been funny if it hadn’t illustrated the reality-free nature of the campaign for mass transformations of the lowest-performing 5% of schools.
Hess is especially perceptive in diagnosing the predictable failure of Race to the Top (RttT) and School Improvement Grants (SIG). I don’t know how many 500-page RttT applications on a nineteen-item checklist Hess read but he reached the same conclusion that I did after studying many of them. There was no need to read the lyrics when the RttT hit an unmistakable chord. The applications’ words didn’t explicitly forbid the investment of time and money into the aligned and coordinated student supports that would have provided the foundation necessary for increases in meaningful learning. The timeline and the accountability metrics made it inevitable that hurried, in-one-ear-out-the-other, teach-to-the-test would take off.
In his dealings with national reformers, Hess saw what I witnessed on a local level. Reform leaders enthusiastically embraced the RttT even though “many of the folks in charge had – until about five minutes earlier – been eloquent in explaining how bureaucracy had stymied school reform.”  They had sincerely prided themselves on their opposition to red tape and their entrepreneurialism, but they turned on a dime because, “When your buddies go off to war, you go with them.”
Hess then nails the dynamics which, I believe, made the damage done by corporate reform increase during the Obama years, “When foundations and the federal government link arms, disagreeing with the president’s policies is tantamount to attacking the foundation’s agenda – and vice versa.” He then calls for “little-p rather than big-P philanthropy,” more rethinking, and less defending of the agenda of the moment.
Hess also witnessed the rise of the public relations campaigns that grew out of the effort to immediately impose transformative change. It looks to me that Big-R Reform peaked in the early Obama years when teacher-bashing propaganda like Waiting for Superman was dominant. Hess adds telling details about the way Big-R Reformers sought An Inconvenient Truth for school reform. He clearly remembers one PR pro who said the reform message needed to be “simpler, stupider, and snazzier.” At the time, it was argued that reformers were “too thoughtful for their own good.”
For years, I tried to explain to Democrats who pushed the Big-R Reform agenda that in education it’s the music, not the lyrics, that matters, but perhaps Hess is better at getting that point across. I would argue that a huge reason for miscommunication is that Big-R Reformers were disgusted by the timidity, the “culture of compliance,” of school systems and they tried to intimidate the education sector into courageousness.
Not understanding the education sector’s culture of powerlessness, as well as a history of “silver bullets” being continually imposed on schools, Big-R Reformers couldn’t understand why systems remained so cautious. This prompted impatient reformers to become even more strident that punishments must always accompany rewards. They seemed to see disincentives as a normative and essential component of policies, and they seemed frustrated that educators focused on the punitive, not the incentives that corporate reformers also helped fund.
The best example of systems focusing on the music and not the lyrics is the predictable manner in which systems responded to value-added teacher evaluations.  When educators encountered the test score growth models, that were inherently biased against teachers in the highest challenge schools, administrators weren’t likely to listen to the words of reformers who presented new teacher evaluations as a means of recruiting and retaining talent in the inner city.
Reformers would explain that the use of “multiple measures” would make value-added scores less inaccurate than other measures. Reformers were reluctant to put estimates of the inaccuracy rate on paper, but I often heard the guess-timate of 5 to 10%. Somehow, Big-R Reformers failed to comprehend that that would mean that inner city teachers, especially, would face that much of a chance per year of having their careers damaged or destroyed by statistical errors. Reformers seemed incapable of putting themselves in the shoes of educators and understanding why systems would profess support for the measures but then “monkey wrench” them so that only 2% or so of teachers would be dismissed.
Rather than refight the big battles where smart people read the same evidence in different ways, I’ll close with a NAEP test score chart cited by reformer Kevin Huffman in support of the contemporary reform movement. I’d argue that Huffman’s evidence makes a powerful case against his approach to school improvement. (Huffman was debating conservative Jay Greene. Once again, this liberal respects the analysis of a conservative reformer more than the neo-liberal or liberal reformers in the debate.)
Huffman’s graphic shows that 4th grade math rose nearly 20 points from 1996 to 2015. My first reaction is that reform has shown some success in improving math instruction, especially in the early years. That should not be a surprise given the sorry state of math instruction, especially in elementary schools, that I’d always seen as the norm. (Similarly, it should not be a surprise when input-driven reforms, like increasing high-quality tutoring or adding counselors or mentors, raise student performance but that is not evidence in favor of output-driven reform.)
However, reform has largely failed to raise reading scores, especially in the older years. There also is a simpler example of how reformers twist themselves into pretzels in order to view this evidence as supportive of reform.
The first years when NCLB could have started to improve schools would have been around 2002 or 2003. Fourth grade test scores increased more in the seven years before 2003 than they did in the twelve years that followed the law’s accountability system.  In other words, even the subject which produced the law’s greatest success does not provide support for the effectiveness of school reform.
I would argue that the metric which is most important is 8th grade reading, which is the most valuable skill and the most reliable NAEP test given to the older students. (It’s hard to evaluate the reliability of 12th grade tests.) Those reading scores increased about as much in the four years that preceded NCLB as they did in the thirteen years which followed 2002. And the same pattern applies to all of the data that Huffman presented. If anything, NAEP test score growth slowed after NCLB, and often it stopped after the Obama administration put NCLB accountability on steroids.
I would not argue that NAEP scores, alone, prove that reform failed. But clearly NAEP scores don’t provide evidence that output-driven, market-driven reform increased student performance.
Some reformers reply with the idea that an accountability “meteor” hit schools in the late 1990s, so gains that preceded NCLB should be counted as evidence for its effectiveness. I don’t know how, but some smart reformers may see this argument as something other than intellectual dishonesty and/or Alt Truth. But that opens even more cans of worms in terms of why smart people see the same education evidence in very different ways.
And that brings us back to why we need Letters to a Young Education Reformer. The current and new generation of reformers may not find this to be comprehensible, but they need to know that there was a time when teachers were allowed to teach according to their professional judgments and when the economy boomed, student performance increased markedly – more than anything accomplished by the test-driven, competition-driven approach to increasing student performance.
Gosh, I remember a day when teachers who taught in a meaningful and culturally relevant manner, and treated students as whole human beings, did not have to fear for their jobs for having the temerity to do so. If I go too far down that road, however, I’ll betray myself as even older than Rick Hess.
Finally, somebody needs to write: Letters to a Young Education Reformer, Obama Loyalist to Obama Loyalist.
(edited to correct error in book title)

Tweets as a Window Into Foundation Strategy, Part 2

February 17, 2017

In my last post I described a method for understanding what ed reform foundations are really pursuing by examining the content of Tweets issued by their grantees.  When some assistants and I conducted this analysis we found that ed reform foundation grantees devote significantly more energy to promoting diversity than promoting school choice. In the prior post I wondered whether this strategy of emphasizing diversity relative to choice is wise given Republican dominance of state governments, where most education policy is formulated and implemented.

The grantees of major ed reform foundations not only give a lower priority to advocating for school choice — both charters and private school choice — but they also seem to prefer top-down accountability approaches over parental empowerment.  It was too difficult for non-expert research assistants to judge whether Tweets championed accountability to regulators as opposed to accountability to parents, but they could reliably count the number of Tweets that mentioned the words accountability, quality, and equity.  When Tweets are advocating top-down accountability they tend to use these words since they typically do not envision having to answer to parents as accountability and because they often argue that quality and equity are the goals of their top-down regulatory efforts.  Of course, some Tweets that use these words are not advocating top-down accountability, but it is also the case that one does not need to specifically mention the words accountability, quality, or equity to be arguing for top-down accountability.  While obviously imprecise, I think the number of Tweets talking about accountability, quality, or equity is a reasonable proxy for support of top-down accountability approaches.

If we compare the number of Tweets using any of these three words to the number of Tweets advocating school choice, we find far greater emphasis on top-down accountability than choice.  Among grantees of the Gates Foundation, Tweets mentioned accountability, quality, or equity 6.7 times as often as they advocated school choice.  Among Broad Foundation grantees the ratio was 3.1.  Arnold Foundation grantees mentioned accountability, quality, or equity 1.9 times as often as they advocated choice.  And at the Walton Foundation the figure was 1.0, representing a relatively even emphasis on top-down accountability and choice.
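For readers who want to see the arithmetic, here is a minimal sketch of how such a ratio could be computed. It is not the code used in the analysis. The function name, the tuple layout, the whole-word case-insensitive matching rule, and the sample Tweets are all assumptions invented for illustration, and the `advocates_choice` flag stands in for the hand coding done by research assistants.

```python
import re
from collections import defaultdict

# The three words named in the post. Matching them as whole words,
# case-insensitively, is an assumption made for this sketch.
KEYWORDS = re.compile(r"\b(accountability|quality|equity)\b", re.IGNORECASE)


def keyword_to_choice_ratio(tweets):
    """Return {foundation: keyword Tweets / choice Tweets}.

    tweets is an iterable of (foundation, text, advocates_choice) tuples,
    where advocates_choice is a hand-coded boolean. A Tweet counts once
    toward the keyword total even if it uses several of the words.
    The ratio is None when a foundation has no choice Tweets.
    """
    keyword_counts = defaultdict(int)
    choice_counts = defaultdict(int)
    for foundation, text, advocates_choice in tweets:
        if KEYWORDS.search(text):
            keyword_counts[foundation] += 1
        if advocates_choice:
            choice_counts[foundation] += 1

    ratios = {}
    for foundation in set(keyword_counts) | set(choice_counts):
        choice = choice_counts[foundation]
        ratios[foundation] = keyword_counts[foundation] / choice if choice else None
    return ratios


# Invented toy data, not Tweets from any real grantee.
SAMPLE = [
    ("Foundation A", "We need real accountability for schools", False),
    ("Foundation A", "Quality and equity go hand in hand", False),
    ("Foundation A", "Parents deserve more school choice", True),
    ("Foundation B", "Charter schools give families options", True),
    ("Foundation B", "Equity matters", False),
]

if __name__ == "__main__":
    for name, ratio in sorted(keyword_to_choice_ratio(SAMPLE).items()):
        print(name, ratio)
```

A ratio of 1.0 in this sketch corresponds to the "relatively even emphasis" described above, and larger values mean the three words appear more often than choice advocacy does.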

Another indication of how much foundation grantees favored top-down accountability relative to parental empowerment could be found in how they reacted to Betsy DeVos’ nomination for Secretary of Education.  Keep in mind that the time-period we examined was October 1 to December 15 of 2016, so DeVos had just been nominated toward the end of that period.  In addition, she had not yet testified, so support for or opposition to her nomination was a reaction to her perceived position on issues rather than her command (or lack thereof) of the details of education policy.  Much more opposition to DeVos was mobilized and expressed after her confirmation hearings, which was after the time period we examined.  Lastly, it is important to consider that DeVos is a relatively centrist Republican reformer.  Her supporters included moderate advocates of top-down accountability, while opposition to her was marked by hostility to parental empowerment or support for choice only if it was accompanied by fairly strong top-down accountability measures.

When my assistants coded Tweets as supporting or opposing DeVos they found that grantees of the Broad Foundation opposed her 2 to 1, although this was based on a small number of Tweets.  Given that Eli Broad ultimately wrote a public letter opposing DeVos, this result is not surprising but does provide some validation for the method of analyzing Tweets as a window into foundation strategy.  Gates Foundation grantees had slightly more Tweets against DeVos than favoring her.  But among the Arnold and Walton foundation grantees, support for DeVos was much stronger, with Tweets 2.9 and 5.9 times more likely to support than oppose her, respectively.

Foundations can, of course, support whatever causes they prefer.  The major education reform foundations do not have to be enamored with school choice, can devote the bulk of their energy to promoting diversity, and can take whatever positions they like with respect to top-down accountability and Betsy DeVos.  My point in reporting these results is simply that the causes being championed by the grantees of these major education reform foundations may differ significantly in some ways from what many people think ed reform foundations support.  These causes being championed may even differ significantly from what the foundation staffs or boards think they are supporting.  The evidence suggests that major ed reform foundation grantees give far higher priority to advocating for diversity than for school choice, seem to favor top-down accountability more than parental empowerment, and sometimes only offer tepid support or even opposition to moderate Republican reformers.

Political Science for Ed Reform Dummies

August 8, 2016

Despite the fact that they imagine themselves to be politically sophisticated, the leadership of the ed reform movement seems to be badly in need of some basic lessons in political science.  Political failures like Common Core, Portfolio Management in New Orleans, electoral defeats (like those recently in Tennessee), and repudiations of the entire ed reform agenda by Black Lives Matter, the NAACP, and the DNC despite extensive courting by left-leaning ed reformers suggest that ed reform might benefit from learning a thing or two about how politics works.

Lesson #1 — Concentrated interests have an advantage over dispersed interests

This is one of the oldest and best-established insights from political science, articulated most clearly in James Q. Wilson’s book, Political Organizations.  In education, the unions and others who benefit from the status quo system are concentrated interests.  They know how proposed changes would help or hurt them and they can easily be mobilized to donate, agitate, and vote to protect their concentrated interests.  Parents and taxpayers may be more numerous, but they are dispersed.  Each proposed change is likely to help or hurt them less than it helps or hurts employees of the school system, and parents/taxpayers (unlike school employees) do not all work in the same place and regularly share information.  This makes parents/taxpayers more dispersed and much harder to organize politically.

The political implications are that reformers will rarely be able to defeat the unions in a direct political assault nor can reformers expect to maintain control over boards or other centers of power over the education system.  Even with large campaign war-chests, as reformers had in the recent Tennessee elections, the unions and their allies will tend to prevail because they have a larger number of concentrated beneficiaries who campaign and vote.  Even with an army of DC organizations endorsing Common Core, the unions and their allies have more boots on the ground in each state to neuter or repeal standards-based reforms. Even with the financing and support of major reform foundations, the unions and their allies could hijack Portfolio Management in New Orleans, returning control to the local school board whose elections they dominate.  All of these failures could have been avoided if reformers understood this first lesson.

Despite the advantage that the concentrated interests of the unions and their allies have, reformers can achieve political victories if they keep two things in mind. First, they shouldn’t build centralized institutions for controlling education because inevitably the unions and their allies will gain control over those institutions and use them to promote their own interests.  Don’t build a national system of standards with aligned tests because the edublob will gain control over it if they do not first block or repeal it. Don’t build Portfolio Managers governing all schools because inevitably the unions and their allies will grab control over that. If reformers understand that they do not have concentrated interests on their side, they should prefer decentralizing control so that it’s hard even for the unions to seize all of the levers of power in one fell swoop.  If there are multiple authorizers, multiple and independent schools, and the revenue and policies of those schools are determined more by parents than bureaucrats, the unions and their allies can’t control all of it.

Second, reformers need to generate their own concentrated interest groups that have a better chance of competing politically with the unions.  The best way to do that is to expand choice so that the beneficiaries of those programs have an interest in protecting those programs.  And families at a choice school are more concentrated physically so that they can be better informed and organized for political action.  The way to fight the organized interests of traditional school employees is by creating organized interests of choice parents.

Reformers have had a bad habit of being attracted to policies that do not generate constituents to protect or expand those programs.  No one ever held a rally to test kids more.  No one ever held a rally to support test-based evaluations of teachers and schools. No one ever held a rally to increase the share of informational texts in reading standards or to ensure that uniform tests are aligned with a particular set of standards.  No one ever held a rally to regulate choice schools more closely so that the range of options is restricted or schools are more constrained.

Whatever merits these policies may have (and their merits are dubious), they are all political losers.  Eventually all of these policies will be blocked, diluted, or co-opted by concentrated interests. But the great political virtue of school choice is that it generates its own constituents who can then be mobilized to protect and expand choice.  Once parents have expanded choices it is extremely difficult to take that away politically and it rarely happens. The same cannot be said for top-down accountability and regulation reforms.

Lesson #2 — Higher income people have more political power than lower income people

As much as reformers may be motivated to promote equity, a basic lesson about political reality is that more advantaged people tend to have more political power.  Rather than lament this fact, reformers should try to use it to advance their goals.  The old political adage that programs for the poor tend to be poor programs is all too true.  Reformers have made horrible political mistakes in concentrating programs in disadvantaged areas, means-testing participants, and focusing on options that are mostly of interest to lower income families.  Not only do these programs tend to be less well funded, overly regulated, and generally of lower quality, but they are always highly vulnerable to being weakened further or eliminated.

To increase the odds of having better quality programs that are more generously funded and more reasonably regulated, reformers should be sure to include higher income families as potential beneficiaries.  And those wealthier families are more likely to be mobilized politically to protect and expand programs.  If reformers should seek to organize concentrated interests of beneficiaries, it would really help if they did not exclude higher income families that tend to have better resources, networks, and experience to participate more effectively in politics.

Lesson #3 — Winning requires 51% of the legislature

Most significant education reforms have been championed by Republican governors and state legislators with relatively little support from the other side of the aisle.  Reformers have invested an enormous amount of time, rhetoric, and money in trying to convince more Democrats to support meaningful reform, but these efforts have made few in-roads.  Reforms continue to be passed largely by Republican governors and legislators.

The problem is that elected Democrats continue to rely heavily on the concentrated interests of the unions and their allies for political support.  Reformers can’t peel away Democratic officials with targeted campaign donations because the unions and their allies can match or exceed those donations.  In addition, the unions can bring an army of campaign volunteers and voters to each election in a way that reformers generally cannot, especially in Democratic districts.  In addition, reform arguments about equity or social justice cannot shame Democrats into supporting their cause.  The unions have their own arguments about equity and social justice that, regardless of their merits, provide a sufficient fig-leaf to Democratic office-seekers.

Rather than chasing Democrats who mostly cannot be won over, reformers should remember that it only takes 51% of the legislature to win.  In states with Republican control of the legislature and/or governor’s mansion (which is currently most states), reformers can make policy entirely or largely with the support of Republicans.  Reformers should craft policies and adopt rhetoric that appeal to the Republicans they need to reach 51%.  Increasingly hyperbolic rhetoric about equity and social justice not only fails to win over Democrats, but it turns away some Republican reformers.  When Republicans are already the majority, reformers should concentrate on keeping all of them on board rather than chasing Democrats they don’t need or can’t get.

Lesson #4 — Helping your political enemies hurts you

This may seem too obvious to need articulating, but in an age where we ship $400 million in cash to the Iranians, this lesson apparently requires emphasis.  Certain groups are allied with the unions and their friends, so reformers shouldn’t provide them with money, a platform for advocating their ideas, or political rhetoric to support them.

In a fit of political naivete, the Gates Foundation actually gave money to the teacher unions in the belief that they could secure support from the unions for Common Core.  The unions happily took the money, made empty promises about support, and promptly betrayed Common Core as soon as it was convenient.  The NewSchools Venture Fund gave a platform to Black Lives Matter at its national conference only a few months before that organization repudiated charter schools and the rest of the traditional reform agenda.

The Uber Model

The political strategy adopted by Uber provides a useful model for ed reformers.  Uber understood that taxi companies were concentrated interests in each market and therefore better positioned to block regulatory changes that would allow Uber to operate more freely.  Rather than attempting a futile direct political assault, Uber focused on creating beneficiaries of its services who could then be organized politically to fight against the taxi companies.  And while low income people are among the greatest beneficiaries from Uber’s services, Uber was determined not to exclude higher income people from its services.  They understood that once wealthier urbanites experienced Uber, they could be mobilized more easily and effectively to fight for it.  The poor benefit by having the rich fight for policy changes that benefit the rich and poor alike.

Uber does not chase the support of politicians who are determined to oppose them.  And Uber does not give money to taxi companies in the false hope that they will come around to seeing the virtues of greater competition with Uber.  In short, Uber seems to understand these 4 basic lessons of political science in a way that reformers have not.

If reformers adopted a political strategy more akin to what Uber is doing, they would enjoy far greater success.  Reformers should avoid building centralized systems of control that the unions will eventually dominate.  Reformers should focus on policies that generate their own constituents, like expanding school choice, rather than policies that only technocrats could love.  And reformers should be sure that higher income families are among the beneficiaries of those choice programs so that their greater political power could be mobilized to more effectively fight to protect and expand programs.  Reformers should also craft policies and adopt rhetoric that appeal more to the legislators they are likely to get to reach 51%.  And it should go without saying that reformers should avoid providing support to groups that are fundamentally opposed to the reform agenda.