Swedish Education Irony Alert!

April 4, 2012

Meet the two coolest things ever made in Sweden.

(Guest post by Greg Forster)

In the new issue of NR, the invaluable Kevin Williamson profiles Massachusetts Senate candidate Elizabeth Warren. He writes that in a book they co-wrote, Warren and her daughter “offer an array of policy prescriptions ranging from the mild (decoupling public-school assignments from geography) to the Swedish (subsidizing stay-at-home parents)…”

Oops! It’s actually “decoupling public-school assignments from geography” that’s the Swedish idea here. Sweden has had a national system of universal school vouchers since 1993. They’ve even developed economically sustainable for-profit school companies. It’s so successful that about a year ago the Social Democratic Party, which I’m tempted to describe as Sweden’s socialist party but will instead describe as its more socialist party, decided not to try to kick the for-profit schools out of the system.

Williamson does have a number of good words for Warren, including this nugget, which ed reformers will particularly enjoy reading:

Warren taught public school briefly and then quit rather than go through the obligatory, despair-inducing credentialing rigmarole (a fact that speaks better of her than almost anything else you’ll learn).


David Gergen on Teach for America

March 27, 2012

(Guest Post by Matthew Ladner)


ABCTE Teachers Outgain the Competition

March 7, 2012

(Guest Post by Matthew Ladner)

An analysis of Florida test score data from Georgia State economist Tim R. Sass provides encouraging news for supporters of alternative teacher certification. The Florida data warehouse contains information about the route that teachers took for certification, and information about the types and number of courses taken in college. Sass includes a number of tables on background characteristics of teachers, and finds that alternatively certified teachers tend to have higher SAT scores and to have taken more math courses in college than traditionally certified teachers.

Sass performs an analysis of student learning gains by certification route, and finds that alternatively certified teachers have similar academic gains to traditionally certified teachers. This is similar to the findings of previous certification studies. Sass, however, found better-than-average results for ABCTE:

The performance of ABCTE teachers in teaching math is substantially better, on average, than for preparation program graduates. Across all specifications and tests, ABCTE teachers boost math achievement by six to eleven percent of a standard deviation more than do traditionally prepared teachers.

The ABCTE route receives no state money and costs a fraction of what students must pay for the College of Education route. Sass rightly cautions that the ABCTE cohort is not huge (there are multiple different routes to certification in Florida) so there should be further research conducted. Like the TFA research, the gains for reading are much smaller than those for math, which merits further investigation. The cut-scores for the ABCTE content knowledge exams are challenging, so it is gratifying to see the ABCTE teachers achieving larger student learning gains.

The philanthropists who have strongly supported Teach for America over the years should take note of these findings. The universe of potential career switchers with solid content backgrounds can add to the ultimately limited pool of Ivy League students willing to serve through TFA, and our students need all the help they can get.

As for teacher certification and Colleges of Education, why do we have those again? The descriptive tables in the Sass study show that alternative certification can be a method for increasing the selectivity of the teaching pool (higher college entrance scores, more content knowledge courses, etc.). The results of this study reinforce previous findings that whatever is going on during those 30 hours of course work, it doesn’t seem to have much to do with better student results on the back end.


Anticipating Responses from Gates

January 9, 2012

Over the weekend I posted about how I thought the Gates Foundation was spinning the results of their Measuring Effective Teachers Project to suggest that the combination of student achievement gains, student surveys, and classroom observations was the best way to have a predictive measure of teacher effectiveness.  Let me anticipate some of the responses they may have:

1) They might say that they clearly admit the limitations of classroom observations and therefore are not guilty of spinning the results to inflate their importance.  They could point to p. 15 of the research paper in which they write: “When value-added data are available, classroom observations add little to the ability to predict value-added gains with other groups of students. Moreover, classroom observations are less reliable than student feedback, unless many different observations are added together.”

Response: I said in my post over the weekend that the Gates folks were careful so that nothing in the reports is technically incorrect.  The distortion of their findings comes from the emphasis and manner of presentation.  For example, the summary of findings in the research paper on p. 9 states: “Combining observation scores with evidence of student achievement gains and student feedback improved predictive power and reliability.”  Or the “key findings” in the practitioner brief on p. 5 say: “Observations alone, even when scores from multiple observations were averaged together, were not as reliable or predictive of a teacher’s student achievement gains with another group of students as a measure that combined observations with student feedback and achievement gains on state tests.”  Notice that these summaries of the results fail to mention the most straightforward and obvious finding: classroom observations are really expensive and cumbersome and yet do almost nothing to improve the predictiveness of student achievement-based measures of teacher quality.

And the proof that the results are being spun is that the media coverage uniformly repeats the incorrect claim that multiple measures are an important improvement on test scores alone.  Either all of the reporters are lousy and don’t understand the reports or the reporters are accurately repeating what they are being told and what they overwhelmingly see in the reports.  My money is on the latter explanation.

And further proof that the reporters are being spun is that Vicki Phillips, the Gates education chief, is quoted in the LA Times coverage mis-characterizing the findings: “Using these methods to evaluate teachers is ‘more predictive and powerful in combination than anything we have used as a proxy in the past,’ said Vicki Phillips, who directs the Gates project.”  This is just wrong.  As I pointed out in my previous post, the combined measure is no more predictive than student achievement by itself.

Lastly, the standard for fair and accurate reporting of results is not whether one could find any way to show that technically the description of findings is not false.  We should expect the most straightforward and obvious description of findings emphasized.  With the Gates folks I feel like I am repeatedly parsing what the meaning of the word “is” is.  That’s political spin, not research.

2) They might say that classroom observations are an important addition because at least they provide diagnostic information about how teachers can improve, while test scores cannot.

Response:  This may be true, but it is not a claim supported by the Gates study.  They found that all of the different classroom observation methods they tried had very weak predictive power.  You can’t provide a lot of feedback about how to improve student achievement based on instruments that are barely correlated with gains in student achievement.  In addition, they were unable to find sub-components of the classroom observation methods that were more predictive, so they can’t tell teachers that they really need to do certain things, since those things are much more strongly related to student learning gains.  Lastly, it is simply untrue that test scores cannot be diagnostic.  There are sub-components of the tests that measure learning in different aspects of the subject.  Teachers could be told to emphasize more those areas on which their students have lagged.

3) They may say that classroom observations and student surveys improve the reliability of a teacher quality measure when combined with test scores.

Response: An increase in reliability is cold comfort for a lack of predictive power.  Reliability is just an indicator of how consistent a measure is.  There are plenty of measures that are very consistent but not helpful in predicting teacher quality.  For example, if we asked students to rate how attractive their teacher was, we would probably get a very “reliable” (consistent) measure from year to year and section to section.  But that consistency would not make up for the fact that attractiveness is unlikely to help improve the prediction of effective teaching.  So, the student survey has a high amount of consistency, but who knows what that is really measuring since it is only weakly related to student learning gains.  It is consistent, but consistently wrong.  Our focus should be on the predictive power of teacher evaluations, and classroom observations and student surveys don’t really do anything to help with that (at least, not according to the Gates study).

4) They may say that classroom observations and student surveys improve on the prediction of student effort and classroom environment.

Response: As I mentioned in the post over the weekend, they don’t really have validated measures of student effort and classroom environment.  The Gates folks took a lot of flak last year for focusing on test-score gains, so they came up with some non-test score outcome measures simply by taking some of the items from the student survey where students are asked about their effort or classroom environment.  We have no idea whether they have really measured the amount of effort students exert or the quality of the classroom environment; they are just using some survey answers on those items and claiming that they have measured those “outcomes.”  The only validated outcome measure we have in the Gates study is the test score gains, so we have to focus on that.
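To make the distinction in response 3 concrete, here is a toy simulation, offered as a rough sketch only. Nothing in it comes from the Gates data: the sample size, the noise levels, and every variable name are invented for illustration, and the "stable" measure is deliberately built to be unrelated to teacher effectiveness, which is a more extreme case than the weak relationship reported for the student survey.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)   # fixed seed so the run is repeatable
N_TEACHERS = 5000        # hypothetical sample size

# Hypothetical "true" teacher effect, plus a stable trait unrelated to it
# (think of the attractiveness rating in the example above).
true_effect = [rng.gauss(0, 1) for _ in range(N_TEACHERS)]
unrelated_trait = [rng.gauss(0, 1) for _ in range(N_TEACHERS)]

def noisy(values, noise_sd):
    """One year's measurement: the underlying value plus random noise."""
    return [v + rng.gauss(0, noise_sd) for v in values]

# A noisy but valid measure of the true effect, observed in two years.
valid_year1 = noisy(true_effect, 1.0)
valid_year2 = noisy(true_effect, 1.0)

# A very consistent measure of the unrelated trait, observed in two years.
stable_year1 = noisy(unrelated_trait, 0.3)
stable_year2 = noisy(unrelated_trait, 0.3)

# Reliability: does the measure agree with itself from year to year?
valid_reliability = pearson(valid_year1, valid_year2)
stable_reliability = pearson(stable_year1, stable_year2)

# Predictive power: does the measure track the true teacher effect?
valid_prediction = pearson(valid_year1, true_effect)
stable_prediction = pearson(stable_year1, true_effect)

print(f"noisy valid measure: reliability={valid_reliability:.2f}, prediction={valid_prediction:.2f}")
print(f"stable unrelated measure: reliability={stable_reliability:.2f}, prediction={stable_prediction:.2f}")
```

In this made-up setup the stable measure is far more consistent from year to year than the valid one, yet it tells us essentially nothing about the true effect. High reliability on its own is no evidence of predictive power.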

—————————————————————————————————

The good news is that my fears about the Gates study being used to dictate what teachers do have not been realized, at least not yet.  But it wasn’t for lack of trying.  If the classroom observations had worked a little better in predicting student learning gains, I’m sure we would have heard about how teachers should run their classrooms to produce greater gains.  But the classroom observations were so much of a dud that Gates education chief, Vicki Phillips, didn’t even attempt to claim that they found that drill and kill is bad or that teachers should avoid teaching to the test.

But the inability to use the classroom observations to tell teachers the “right” way of teaching is another way of saying that the classroom observations are not able to be used for diagnostic purposes.  The most straightforward reading of the Gates results is that classroom observations appear to be an expensive and ineffective dud.  But it’s hard for an organization that spends $45 million on a project to scientifically validate classroom observations to admit that it failed.   It’s hard enough for a third-party evaluator to say that, let alone an in-house study about a key aspect of the Gates policy agenda.


How the Gates Foundation Spins its Research

January 7, 2012

The Gates Foundation has released the next installment of reports in their Measuring Effective Teachers Project.  When the last report was released, I found myself in a tussle with the Gates folks and Sam Dillon at the New York Times because I noted that the study’s results didn’t actually support the finding attributed to it.  Vicki Phillips, the education chief at Gates,  told the NYT and LA Times that the study showed that “drill and kill” and “teaching to the test” hurt student achievement when the study actually found no such thing.

With the latest round of reports, the Gates folks are back to their old game of spinning their results to push policy recommendations that are actually unsupported by the data.  The main message emphasized in the new round of reports is that we need multiple measures of teacher effectiveness, not just value-added measures derived from student test scores, to make reliable and valid predictions about how effective different teachers are at improving student learning.

This is the clear thrust of the newly released Policy and Practice Brief  and Research Paper and is obviously what the reporters are being told by the Gates media people.  For example, Education Week summarizes the report as follows:

…the study indicates that the gauges that appear to make the most finely grained distinctions of teacher performance are those that incorporate many different types of information, not those that are exclusively based on test scores.

And Ed Sector says:

The findings demonstrate the importance of multiple measures of teacher evaluation: combining observation scores, student achievement gains, and student feedback provided the most reliable and predictive assessment of a teacher’s effectiveness.

But buried away on p. 51 of the Research Paper in Table 16 we see that value-added measures based on student test results — by themselves — are essentially as good or better than the much more expensive and cumbersome method of combining them with student surveys and classroom observations when it comes to predicting the effectiveness of teachers.  That is, the new Gates study actually finds that multiple measures are largely a waste of time and money when it comes to predicting the effectiveness of teachers at raising student scores in math and reading.

According to Table 16, student achievement gains correlate with the underlying value-added by teachers at .69. If the test scores are combined (with an equal weighting) with the results of a student survey and classroom observations that rate teachers according to a variety of commonly-used methods, the correlation to underlying value-added drops to be between .57 and .61.  That is, combining test scores with other measures where all measures are equally weighted actually reduces reliability.

The researchers also present the results of a criteria-weighted combination of student achievement gains, student surveys, and classroom observations based on the regression coefficients of how predictive each is of student learning growth in other sections for the same teacher.  Based on this, the test score gains are weighted at .729, the student survey at .179, and the classroom observations at .092.  This tells us how much more predictive test score gains are than student surveys or classroom observations.  Yet even when test score gains constitute 72.9% of the combined measure, the correlation to underlying teacher quality still ranges between .66 and .72, depending on which method is used for rating the classroom observations.  The criteria-weighted combined measure provides basically no improvement in reliability over using test score gains by themselves.
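For readers who want to see what the two weighting schemes do mechanically, here is a minimal sketch. The three weights are the ones quoted above; everything else is an assumption made for illustration, including the idea that each measure is already on a common standardized scale, the `composite` helper, and the hypothetical teacher's scores. It is not the researchers' actual procedure.

```python
# Weights quoted in the post for the criteria-weighted composite.
CRITERIA_WEIGHTS = {
    "test_score_gains": 0.729,
    "student_survey": 0.179,
    "classroom_observation": 0.092,
}

# Equal weighting: each of the three measures counts for one third.
EQUAL_WEIGHTS = {name: 1 / 3 for name in CRITERIA_WEIGHTS}

def composite(scores, weights):
    """Weighted sum of a teacher's scores (assumed to be standardized)."""
    return sum(weights[name] * scores[name] for name in weights)

# Hypothetical teacher, invented for illustration: strong test-score gains,
# an average survey result, and a weak classroom observation rating.
teacher = {
    "test_score_gains": 1.0,
    "student_survey": 0.0,
    "classroom_observation": -1.0,
}

criteria_score = composite(teacher, CRITERIA_WEIGHTS)
equal_score = composite(teacher, EQUAL_WEIGHTS)

print(f"criteria-weighted composite: {criteria_score:.3f}")
print(f"equal-weighted composite:    {equal_score:.3f}")
```

With these invented scores the criteria-weighted composite stays close to the test-score result, while equal weighting lets the weak observation rating cancel the strong test-score gains entirely. That is the sense in which the criteria weights make the combined measure mostly a test-score measure.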

And using multiple measures does not improve our ability to distinguish between effective and ineffective teachers.  Using test scores alone the difference between the top quartile and bottom quartile teacher in producing  student value-added is .24 standard deviations in math learning growth on the state test.  If we combine test scores with student surveys and classroom observations using an equal weighting, the difference between top and bottom quartile teachers shrinks to be between .19 and .21.  If we use the criteria weights, where test scores are 72.9% of the combined measure, the gap between top and bottom teacher ranges between .22 and .25.  In short, using multiple measures does not improve our ability to distinguish between effective and ineffective teachers.
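The arithmetic behind that conclusion can be laid out in a few lines. This sketch only tabulates the math figures quoted in this post (attributed above to Table 16) and computes the best-case difference between each combined measure and test scores alone; the dictionary layout and the `best_case_gain` helper are hypothetical names introduced here.

```python
# Math figures as quoted in this post; ranges are (low, high) across the
# different classroom observation methods.
REPORTED = {
    "correlation with underlying value-added": {
        "tests_alone": 0.69,
        "equal_weighted": (0.57, 0.61),
        "criteria_weighted": (0.66, 0.72),
    },
    "top vs. bottom quartile gap (SD)": {
        "tests_alone": 0.24,
        "equal_weighted": (0.19, 0.21),
        "criteria_weighted": (0.22, 0.25),
    },
}

def best_case_gain(row, combo):
    """Upper end of the combined measure's range minus tests alone."""
    return round(row[combo][1] - row["tests_alone"], 2)

gains = {
    metric: {
        combo: best_case_gain(row, combo)
        for combo in ("equal_weighted", "criteria_weighted")
    }
    for metric, row in REPORTED.items()
}

for metric, by_combo in gains.items():
    for combo, gain in by_combo.items():
        print(f"{metric} | {combo}: best case {gain:+.2f} vs. tests alone")
```

Even taking the most favorable end of each range, equal weighting comes out below test scores alone on both figures, and the criteria-weighted measure is ahead by only a few hundredths.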

The same basic pattern of results holds true for reading, which can be seen in Table 20 on p. 55 of the report.  Combining test score measures of teacher effectiveness with student surveys and classroom observations does improve a little our ability to predict how students would answer survey items about their effort in schools as well as how they felt about their classroom environment.  But unlike test scores, which have been shown to be strong predictors of later life outcomes, I have no idea whether these survey items accurately capture what they intend or have any importance for students’ lives.

Adding the student surveys and classroom observation measures to test scores yields almost no benefits, but it adds an enormous amount of cost and effort to a system for measuring teacher effectiveness.  To get the classroom observations to be usable, the Gates researchers had to have four independent observations of those classrooms by four separate people.  If put into practice in schools that would consume an enormous amount of time and money.  In addition, administering, scoring, and combining the student survey also have real costs.

So, why are the Gates folks saying that their research shows the benefits of multiple measures of teacher effectiveness when their research actually suggests virtually no benefits to combining other measures with test scores and when there are significant costs to adding those other measures?  The simple answer is politics.  Large numbers of educators and a segment of the population find relying solely on test scores for measuring teacher effectiveness to be unpalatable, but they might tolerate a system that combined test scores with classroom observations and other measures.  Rather than using their research to explain that these common preferences for multiple measures are inconsistent with the evidence, the Gates folks want to appease this constituency so that they can put a formal system of systematically measuring teacher effectiveness in place.  The research is being spun to serve a policy agenda.

This spinning of the findings is not just an accident or the result of a misunderstanding.  It is clearly deliberate.  Throughout the two reports Gates just released, they regularly engage in the same pattern of presenting the information. They show that the classroom observation measures by themselves have weak reliability and validity in predicting effective teachers.  But if you add the student survey and then add the test score measures, you get much better measures of effective teachers.  This pattern of presentation suggests the importance of multiple measures, since the classroom observations are strengthened when other measures are added.  The only place you find the reliability and validity of test scores by themselves is at the bottom of the Research Paper in Tables 16 and 20.  If both the lay-version and technical reports had always shown how little test scores are improved by adding student surveys and classroom observations, it would be plain that test scores alone are just about as good as multiple measures.

The Gates folks never actually inaccurately describe their results (as Vicki Phillips did with the previous report).  But they are careful to frame the findings as consistently as possible with the Gates policy agenda of pushing a formal system of measuring teacher effectiveness that involves multiple measures.  And it worked, since the reporters are repeating this inaccurate spin of their findings.

———————————————————————-

(UPDATE — For a post anticipating responses from Gates, see here.)


Teachers Matter

January 3, 2012

My friend and colleague, Marcus Winters, has a new book out on how to improve the quality of the teaching workforce.  Teachers Matter is an excellent summary of the literature on how best to recruit, train, and motivate teachers.  It’s a must-read for anyone interested in merit pay, credentialing, and teacher evaluation.  It’s a particularly good book to assign for classes that cover these subjects.  Check it out.


The Gates Foundation and the Rise of the Cool Kids

July 28, 2011

(Guest Post by Matthew Ladner)

Jay and Greg have been carrying on an important discussion concerning the Gates Foundation and education reform. I wanted to add a few thoughts.

Rick Hess and others have noted the “philanthropist as royalty” phenomenon in the past. Any philanthropist runs the danger of only hearing what they want to hear from their supplicants, and Gates as the largest private foundation runs the biggest risk. The criticism of the Gates Foundation I had seen in the past emanated from the K-12 reactionary fever swamp, hardly qualifying as constructive.

The challenge faced by philanthropists: how do you challenge your own assumptions and evaluate your own efforts honestly? Do you hire formidable Devil’s advocates to level their most skeptical case against your efforts?

I don’t know the answer to these questions, just that if I were Bill Gates I would be terrified of everyone telling me how right my thinking is because they want my money. This is however the best sort of problem to have…

Jay’s central critique of the Gates Foundation strategy seems to be that they have put too much faith in a centralized command and control strategy. They would be wise to entertain this thought. If command and control alone were the solution, then we wouldn’t have education problems: district, state and federal governance have all failed to prevent widespread academic failure for decades.

The Gates strategy does however embrace decentralization. Over the years they have supported charter schools, and fiercely opposed the worst one-size-fits-all policy of all: salary schedules and automatic/irrevocable tenure. Riley’s WSJ article makes clear that Gates understands the benefits of private school choice, but that he falls for the Jay Mathews fallacy of thinking it is just too politically difficult.

Sigh…perhaps next year Greg can make a dinner bet with Bill.

Gates is also the primary backer of Khan Academy. This new article on Sal Khan in Wired magazine makes clear that Khan understands the danger of being swallowed by school systems and that he is not going to allow it to happen. Khan Academy is both radically decentralized and is in the early stages of being used by people within the centralized school system to improve outcomes.

Whatever the mistakes to date, the Gates Foundation has in my mind succeeded in serving as a counter-weight to the NEA, mostly through funding the efforts of a myriad network of reform organizations collectively known as the Cool Kids. Today, there is a struggle for power going on within the Democratic Party over K-12 policy and the Gates Foundation deserves some credit in my mind for supporting the ideas behind the “Democrat Spring” on education policy. This spring is following more of the Syrian than the Egyptian model thus far, but it is happening, and it is very important.

Does that mean that they are the “good guys” and Jay should lay off of them? Of course not: reasoned critiques of large philanthropists are in short supply for all of the factors cited above. Jason Riley wished that Gates were bolder in embracing decentralization reforms, but noted that in the end it was the Gates rather than the Riley Foundation. This is absolutely true, but it doesn’t make the royalty problem go away, and leaves a continuous question of how the emperor gets feedback on his new clothes.

I don’t agree with the Cool Kids about everything. The next time I hear someone ask a question about having Common Core replace NAEP (the very pinnacle of naive folly) for instance I may pull out entire tufts of my graying, thinning hair in utter exasperation. Reformers of all stripes need to be on guard against the ship-wheel conceit, which is to imagine that if only my strong hands steered the ship, we’d sail through the rocky shoals of ed reform without a hitch.

The East Germans ran a much better economy than the North Koreans, much to the benefit of Germans and to the detriment of Koreans. This is real and important in human terms, and I do not make this point glibly. I never heard about an East German famine decimating the population, but food shortages have even soldiers starving to death in North Korea (pity the women and children). Better quality management is good and desirable, but…it will only take you so far. Today, Chinese apparatchiks are noisily crediting themselves for the tremendous economic progress in China without the slightest hint of irony. Without the market forces Deng introduced and with more apparatchiks, China would revert back to a starving backwater. With fewer apparatchiks, her progress would almost certainly accelerate.

As Sara Mead correctly noted in this guest post at Eduwonk, today’s education debate largely involves a mixture of technocratic and market-based reforms (neo-liberals) on one side and a group of reactionaries lacking realistic solutions on the other. A third of our 4th graders can’t read and have been shoved into the dropout pipeline. We need both technocratic and market based reforms, and we need stronger reforms of both sorts than those fielded to date.

Jay’s critique concerns the right mix of reforms within the bounds of the neo-liberal consensus. This of course is a matter of debate, and debate is the path to deeper understanding. The sheer size of the Gates Foundation has the potential to stifle such debate as it relates to their efforts, even passively, and reformers should recognize the danger in allowing it to do so. This isn’t about them so much as it is about us.


Technology and School Choice: The False Dichotomy

July 18, 2011

(Guest post by Greg Forster)

Terry Moe has a great article in today’s Journal about how entrepreneurial innovation taking advantage of new technology is putting the teachers’ unions on the road to oblivion. It’s a great article, except that it draws one false dichotomy.

Fans of JPGB know that we do love us some high-tech transformation of schooling around here. Matt has been on this beat for a long time, and hardly a week goes by that he doesn’t update us on the latest victory of “the cool kids” over “edu-reactionaries” in the reinvention of the school. But he doesn’t own that turf entirely; I made this the theme of my contribution to Freedom and School Choice (as did Matt, of course).

The problem is that Moe insists high-tech transformation of schooling, and the destruction of union control it entails, is absolutely, positively a separate phenomenon from the wave of school reform victories this year:

This has been a horrible year for teachers unions…But the unions’ hegemony is not going to end soon. All of their big political losses have come at the hands of oversized Republican majorities. Eventually Democrats will regain control, and many of the recent reforms may be undone. The financial crisis will pass, too, taking pressure off states and giving Republicans less political cover…

Over the long haul, however, the unions are in grave trouble—for reasons that have little to do with the tribulations of this year…The first is that they are losing their grip on the Democratic base…Then there’s a crucial dynamic outside of politics: the revolution in information technology.

Really? The victories of 2011 – “the year of school choice” – aren’t in the same category with the long-term path to oblivion the unions are on? On the contrary, 2011 is the year of school choice precisely because it has become obvious that the unions are on track for oblivion, for the reasons Moe identifies.

Moe’s argument relies on the assumption that when Republicans are in power, they always make dramatic and innovative school reform policies their #1 priority.

Sorry  . . . lost my train of thought I was laughing so hard . . . let me pick myself up off the floor . . . there, now where was I? Oh, yes.

The GOP hasn’t touched real school reforms with a hundred-foot pole in years. Why did it all of a sudden embrace real reform this year?

Could it be because…

  1. …the unions are losing their grip on the Democratic base, meaning squishy Republicans don’t have to worry about being demonized as right-wing loonies simply for embracing real reform, and…
  2. …the revolution in information technology has made it obvious to MSM and other key cultural gatekeepers that the unions are the reactionaries, once again reassuring squishy Republicans they won’t be demonized for embracing real reform?

Obviously the financial crisis was also a factor here, as Moe rightly points out. But is that really an immediate-term phenomenon, bound to disappear next week? What really counts is whether the nation feels so rich it can afford to ignore ballooning school costs. Technically the recession ended two years ago and we’ve been in “recovery” for two years. How’s that feeling? Do we feel rich and luxurious again? Are we on track to restore a widespread national sense of inevitable prosperity by 2012? By 2014? By 2020?

Bottom line, the unions losing Democratic support and taking their stand in opposition to entrepreneurial change was the crucial, indispensable precondition for this year’s wave of school reform success.

Oh, and guess what? Sustaining those policies, especially school choice, will be the only way this wave of advancing technology will produce the results Moe is expecting. Only school choice can prevent the blob from neutralizing any reform you throw at it. If the techno-innovators turn their back on choice and competition, they’ll be dead meat. (For more on that topic, see the aforementioned chapter by your humble servant in Freedom and School Choice.)


Sea Change in Tenure Policy

July 13, 2011

(Guest Post by Matthew Ladner)

Ed Week delivers a solid piece on the changes around the states on teacher policy: LIFO, tenure reform, etc. Money quote:

Jennifer Dounay Zinth, a senior policy analyst at the Denver-based Education Commission of the States, which has been tracking the legislation closely, said the protracted interest in revamping the teaching profession amounts to a “sea change.”

“It’s hard to get your arms around—not just the number of bills being enacted but the breadth and depth of changes being made,” she said.

Note that while Red states are in the lead, even deep Blue states like Illinois have undertaken reform as well.

Randi Weingarten seems to have noticed, as the NYT reports:

Ms. Weingarten, who has long opposed the cuts — both budgetary and rhetorical — made to teachers, told her audience that the current debate on education “has been hijacked by a group of self-styled reformers” from “on high” who want to blame educators’ benefits and job security for states’ notorious budget problems. Calling the union gathering “an affirmation,” she countered that change to the education system should instead come through greater community support for teachers themselves and recognition for the commitment to children they already demonstrate. 

Hijacked from self-styled reformers from on high

…oh sorry…

…just savoring the moment.

We are still in what I view as the early stages of divorcing ourselves from the entirely indefensible practice of treating teachers like interchangeable widgets. We have a great deal to learn, and may need to develop a reliable system of third-party academic assessment as we seek to attach greater consequences to student learning gains if techniques like erasure analysis ultimately fall short. Rather than an argument for the status-quo, this is all the more reason to get on with it.

The debate hasn’t been hijacked, Randi; it’s been won fair and square.


Governor Mitch Daniels Lays Out His Education Vision at AEI

May 4, 2011

(Guest Post by Matthew Ladner)

Governor Daniels lays out his education reforms at the American Enterprise Institute. If Indiana can sustain these reforms with prolonged high-quality implementation, they can become the new Florida. Indiana 2011 stands as the best reform session since Florida 1999 in my book.

