Grading New York

November 13, 2008

(Guest post by Greg Forster)

Our old friend and colleague Marcus Winters has just released a study on New York City’s school grading program:

In 2006-07, New York City, the largest school district in the United States, decided it would follow several other school systems in adopting a progress report program. Under its program, the city grades schools from A to F according to an accumulating point system based on the weighted average of measurements of school environment, students’ performance, and students’ academic progress.

The implementation of these progress reports has not been without controversy. While many argue that they inform parents about public school quality and encourage schools to improve, others contend that grades lower morale at low-performing schools. To date there has been too little empirical information about the program’s effectiveness to settle these questions.

Schools that receive D and F grades repeatedly are subject to takeover by the city. A previous study (Rockoff and Turner 2008) found positive results from the program but lacked student-level data. Marcus’s study has got student-level data, regression discontinuity – the whole smash. Tale of the tape:

Students in schools earning an F grade made overall improvements in math the following year, though these improvements occurred primarily among fifth-graders.

Students in F-graded schools did no better or worse in English than students in schools that were not graded F.

Whatever problems NCLB may have, school accountability does work in places where state and local government have the political will to do it seriously. Even in places where the problems seem intractable, like New York City.

EMTs are standing by in case certain people’s heads explode.


Looking Abroad for Hope

November 5, 2008

hope

HT despair.com. Looking for a Christmas idea to suit the new reality? Why not a despair.com gift certificate – “For the person who has everything, but still isn’t happy.”

(Guest post by Greg Forster)

Looking around for something to give me hope this morning, I find the best place to turn (for today, at least) is outside the U.S. Specifically, I turn to the recently released study in Education Next by Martin West and Ludger Woessmann finding that around the world, private school enrollment is associated with improved educational outcomes in both public and private schools, as well as lower costs.

Well-informed education wonks will say, “duh.” A large body of empirical research has long since shown, consistently, that competition improves both public school and private school outcomes here in the U.S., while lowering costs. And the U.S. has long been far, far behind the rest of the world in its largely idiosyncratic, and entirely irrational, belief that there’s something magical about a government school monopoly.

And private school enrollment is an imperfect proxy for competition. It’s OK to use it when it’s the best you’ve got. I’ve overseen production of some studies at the Friedman Foundation that used it this way, and I wouldn’t have done that if I didn’t think the method were acceptable. However, it should be remembered that some “private schools” are more private than others. In many countries, private school curricula are controlled – sometimes almost totally so – by government. And the barriers to entry for private schools that aren’t part of a government-favored “private” school system can be extraordinary.

That said, this is yet another piece of important evidence pointing to the value of competition in education, recently affirmed (in the context of charter schools, but still) by Barack Obama. Who I understand is about to resign his Senate seat – I guess all those scandals and embarrassing Chicago machine connections the MSM kept refusing to cover finally caught up with him.


The Upward Surge of Mankind?

October 30, 2008

(Guest Post by Matthew Ladner)

Florida tripled the number of Hispanic and African American students passing one or more AP exams with a program that included a financial incentive for schools and teachers.

Meanwhile, C. Kirabo Jackson finds positive results for a similar Texas pilot program in Education Next:

According to my assessment, the incentives produce meaningful increases in participation in the AP program and improvements in other critical education outcomes. Establishment of APIP results in a 30 percent increase in the number of students scoring above 1100 on the SAT or above 24 on the ACT, and an 8 percent increase in the number of students at a high school who enroll in a college or university in Texas. My evidence suggests that these outcomes are likely the result of stronger encouragement from teachers and guidance counselors to enroll in AP courses, better information provided to students, and changes in teacher and peer norms.

Gordon Gekko for Secretary of Education? I can see the confirmation hearing speech in my head:

The point is, ladies and gentleman, that greed — for lack of a better word — is good.

Greed is right.

Greed works.

Greed clarifies, cuts through, and captures the essence of the evolutionary spirit.

Greed, in all of its forms — greed for life, for money, for love, knowledge — has marked the upward surge of mankind.

And greed — you mark my words — will not only save public education, but that other malfunctioning corporation called the USA.

Just kidding, but I will say this: we need to continue experimenting with programs like this. They certainly seem to beat throwing money at schools in the hope that they will improve.

 


Modest Programs Produce Modest Results . . . Duh.

September 3, 2008

HT perfect stranger @ FR

By Greg Forster & Jay Greene

Edwize is touting a new “meta-analysis” by Cecilia Rouse and Lisa Barrow claiming that existing voucher programs produce only modest gains in student learning.

Edwize quotes the National Center for the Study of Privatization in Education (NCSPE), which sponsored the paper and is handling its media push, describing the paper as a “comprehensive review of all the evaluations done on education voucher schemes in the United States.”

But the paper itself says something different: “we present a summary of selected findings…” (emphasis added). Given Edwize’s recent accusations about cherry-picking research, which are repeated in his post on the Rouse/Barrow paper, we thought he’d be more sensitive to the difference between a comprehensive review of the research and a review that merely presents selected findings. (By contrast, the reviews we listed here are comprehensive.)

Even more important, the Rouse/Barrow paper provides no information on the criteria they used to decide which voucher experiments, and which analyses of those experiments, to include among the “selected findings” they present, and which to exclude from their review. The paper includes participant effect studies from Milwaukee, Cleveland, DC, and New York City, but does not include very similar studies conducted on programs in Dayton or Charlotte. In New York it includes analyses by Mayer, Howell and Peterson, as well as Krueger and Zhu, but not by Barnard, et al. The paper includes systemic effect analyses from Milwaukee and Florida, but excludes analyses by Howell and Peterson as well as by Greene and Winters.

Clearly this paper is not intended to be, and indeed it does not even profess to be, a comprehensive review.

But even with its odd and unexplained selection of studies to include and exclude, Rouse and Barrow’s paper nevertheless finds generally positive results. They identified 7 statistically significant positive participant effects and 4 significant negative participant effects (all of which come from one study: Belfield’s analysis of Cleveland, which is non-experimental and therefore lower in scientific quality than the studies finding positive results for vouchers). In total, 16 of the 26 point estimates they report for participant effects are positive.

On systemic effects, they report 15 significant positive effects and no significant negative effects. Of the 20 point estimates, 16 are positive.
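For readers who want the shares rather than the raw counts, here is a minimal sketch of the arithmetic in Python. The tallies are the ones quoted above; the dictionary layout and the function name `positive_share` are our own illustrative choices and appear nowhere in the Rouse/Barrow paper.

```python
# Tallies of point estimates as quoted in this post from the Rouse/Barrow review.
# The structure and names here are illustrative assumptions, not the paper's.
tallies = {
    "participant": {"positive": 16, "total": 26},
    "systemic": {"positive": 16, "total": 20},
}


def positive_share(positive, total):
    """Return the fraction of point estimates that have a positive sign."""
    return positive / total


# Compute the share of positive estimates for each type of effect.
shares = {name: positive_share(**counts) for name, counts in tallies.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%} of point estimates are positive")
```

This is only a tally of signs. It says nothing about the size or statistical significance of the individual estimates.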

And yet they conclude that the evidence “is at best mixed.” If this were research on therapies for curing cancer, the mostly positive and often significant findings they identified would never be described as “at best mixed.” We would say they were encouraging at the very least.

Moreover, the paper is not, and doesn’t claim to be, a “meta-analysis.” That term doesn’t even appear anywhere in the paper. It’s really just a research review, as the first sentence of the abstract clearly states (“In this article, we review the empirical evidence on…”). It looks like the term “meta-analysis,” like the phrase “comprehensive review,” was introduced by the NCSPE’s publicity materials.

What’s the difference? A meta-analysis performs an original analysis drawing together the data and/or findings of multiple previous studies, identified by a comprehensive review of the literature. The “conclusions” of a research review are just somebody’s opinion. Meta-analyses vary from simple (counting up the number of studies that find X and the number that find Y) to complex (using statistical methods to aggregate data or compare findings across studies). But what they all have in common is that they present new factual knowledge. A research review produces no new factual knowledge; it just states opinions.
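To make the simple-versus-complex distinction concrete, here is a rough sketch in Python. It contrasts a vote count with one standard pooling method, a fixed-effect inverse-variance average. The effect sizes and standard errors below are HYPOTHETICAL numbers invented for illustration; they come from no voucher study, and the function names are our own. It is not the method of any paper discussed here.

```python
import math

# HYPOTHETICAL (effect size, standard error) pairs, invented for illustration.
# They are not taken from any voucher study.
studies = [(0.10, 0.05), (0.02, 0.04), (-0.03, 0.06), (0.08, 0.03)]


def vote_count(studies):
    """Simple approach: count estimates that are positive vs. not positive."""
    positive = sum(1 for effect, _ in studies if effect > 0)
    return positive, len(studies) - positive


def fixed_effect_pool(studies):
    """More complex approach: weight each estimate by 1 / SE^2 and average."""
    weights = [1 / se ** 2 for _, se in studies]
    weighted_sum = sum(w * effect for w, (effect, _) in zip(weights, studies))
    pooled = weighted_sum / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))  # standard error of the pooled estimate
    return pooled, pooled_se


positive, not_positive = vote_count(studies)
pooled, pooled_se = fixed_effect_pool(studies)
print(f"vote count: {positive} positive, {not_positive} not positive")
print(f"pooled estimate: {pooled:.3f} (SE {pooled_se:.3f})")
```

Either way, the output is a new number computed from the underlying studies, which is what separates a meta-analysis from a narrative review.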

There’s nothing wrong with researchers having opinions, as we have argued many times. It’s essential. But it’s even more essential to maintain a clear distinction between what is a fact and what is somebody’s opinion. Voucher opponents, as the saying goes, are entitled to their own opinions but not their own facts. (Judging by the way they conduct themselves, this may be news to some of them – for example, see Greg Anrig’s claims in the comment thread here.)

By falsely puffing this highly selective research review into a meta-analysis, NCSPE will deceive some people – especially journalists, who these days are often familiar with terms like “meta-analysis” and know what they mean, even if NCSPE doesn’t – into thinking that an original analysis has been performed and new factual knowledge is being contributed, when in fact this is just a repetition of the same statement of opinion that voucher opponents have been offering for years.

(We don’t blame Edwize for repeating NCSPE’s falsehood; there’s no shame in a layman not knowing the proper meaning of the technical terms used by scholars.)

And what about the merits of the opinion itself? The paper’s major claim, that the benefits of existing voucher programs are modest, is exactly what we have been saying for years. For example, in this study one of us wrote that “the benefits of school choice identified by these studies are sometimes moderate in size—not surprising, given that existing school choice programs are restricted to small numbers of students and limited to disadvantaged populations, hindering their ability to create a true marketplace that would produce dramatic innovation.”

And there’s the real rub. Existing programs are modest in size and scope. They are also modest in impact. Thank you, Captain Obvious.

The research review argues that because existing programs have a modest impact, we should be pessimistic about the potential of vouchers to improve education dramatically either for the students who use them or in public schools (although the review does acknowledge the extraordinary consensus in the empirical research showing that vouchers do improve public schools).

But why should we be pessimistic that a dramatic program would have a dramatic impact on grounds that modest programs have a modest impact?

One of us recently offered a “modest proposal” that we try some major pilot programs for the unions’ big-spending B.B. approach and for universal vouchers (as opposed to the modest voucher programs we have now), and see which one works. He wrote: “Better designed and better funded voucher programs could give us a much better look at vouchers’ full effects. Existing programs have vouchers that are worth significantly less than per pupil spending in public schools, have caps on enrollments, and at least partially immunize public schools from the financial effects of competition. If we see positive results from such limited voucher programs, what might happen if we could try broader, bolder ones and carefully studied the results?”

Has Edwize managed to respond to that proposal yet? If he has, we haven’t seen it. Come on – if you’re really as confident as you profess to be that your policies are backed up by the empirical research and ours are not, what are you so afraid of?

And while we’re calling him out, here’s another challenge: in the random-assignment research on vouchers, the point gains identified for vouchers over periods of four years or less are generally either the same size as or larger than the point gains identified over four years for reduced class sizes in the Tennessee STAR experiment. Will Edwize say what he thinks of the relative size of the benefits identified from existing voucher programs and class size reduction in the empirical research?


The Meta-List: An Incomplete List of Complete Lists

August 27, 2008

“The Treason of Images,” Rene Magritte, 1928-29 (“This is not a pipe.”)

(Guest post by Greg Forster)

Jay posted two “complete lists” of voucher research this week, and a number of people seem to have found them helpful. Jay and I have both spent a lot of time circulating these lists for years (they change over time, of course, as new research gets done). We keep on thinking we’ve circulated these lists so much that there can’t be much use in circulating them further, yet we keep on finding more people who say, “Wow, I’ve never seen anything like this before, this is really helpful!”

Well, if people found those two lists helpful, maybe they’d like to see some of the other lists that have been compiled. So here’s a meta-list: a list of complete lists of research.

Of course, this is not a complete list of the complete lists. If anyone wants to add more in the comment section, that will help make this page even more useful. And I’ll come back and update the list as needed, so that this page will remain a useful resource for people looking for all the research on vouchers.

Though no doubt others will think that my list of complete lists isn’t nearly complete enough. I hope they’ll compile their own lists of complete lists – the more the merrier. And when there are enough lists of complete lists out there, we’ll need to make a list of them, so that people can keep track of them all . . .

Of course, these lists are all “complete to my knowledge.” There may always be a study lurking out there that hasn’t been noticed – although on the voucher issue that’s a somewhat more remote possibility than it is with other issues.

Last year I made an effort to summarize all the research on all the issues relating to vouchers in this study. The sections covering random-assignment studies of voucher participants and studies of how vouchers affect public schools are now out of date, but the report will point you to a bunch of other studies on issues that don’t have enough of a body of research – or have too much of a body of research – to generate a “complete list.” For example, you’ll find a discussion of the evidence on questions like the fiscal impact of voucher programs, and whether vouchers provide all students with access to schooling.

On those last two subjects – fiscal impacts and whether the private school sector provides broad, inclusive access to schooling for all students – the Friedman Foundation offers handy guides (here and here) and references to the research issues (here and here).

And finally, here is a meta-list that will point you to a bunch of complete lists of research on issues related to vouchers. Personally, I’ve found this resource to be the most helpful of all.

NOTE: This post is edited as needed to keep it up to date.


Yet Another Study Finds Vouchers Improve Public Schools

August 21, 2008

(Guest post by Greg Forster)

The Friedman Foundation has just released my new study showing that Ohio’s EdChoice voucher program had a positive impact on academic outcomes in public schools. I’m told that it has generated a number of news hits, though the only reporter to interview me so far was the author of this piece in the Columbus Dispatch. When she interviewed me I thought she was hostile, because her questions put me a little off balance, but the article is perfectly fair. I guess if the reporter is doing her job right, the interviewees ought to feel like they were being challenged. The final product is what counts.

The positive results that I found from the EdChoice program were substantial but not revolutionary. That’s not surprising, given that 1) failing-schools vouchers aren’t the optimum way to structure voucher programs in the first place, and 2) the data were from the program’s first year, when it was smaller and more restricted than it is now.

It’s too early to be sure, but among the large body of empirical studies consistently showing that vouchers improve public schools, a pattern seems to be emerging that voucher programs have a bigger impact on public schools when they’re larger, more universal, and have fewer obstacles to parental participation. That’s worth watching and studying further as opportunities arise.


Teacher Contracts: Blame States, Too

July 30, 2008

(Guest post by Greg Forster)

The National Council on Teacher Quality has published a new report on the sausage-factory process behind teacher contracts. (HT EIA, or as I like to call him, ALELR.)

Readers of Jay P. Greene’s Blog will probably not need to be told that reformers have long identified teacher contracts as one of the most important root causes, if not the single most important root cause, of the system’s ills. It is because of these contracts, for example, that pay scales, quality control and disciplinary procedures in education resemble those of a factory (even a factory circa 1965) more than those of a profession.

Defenders of the system sometimes argue that teachers should receive the deference that is due to professionals. Personally, I’d love to see that – but not until they’re compensated and held accountable like professionals.

When that day comes, teachers will be able to say, “Now we have freedom and responsibility. It’s a very groovy time!”

Until then, you can’t expect to have one without the other for very long. The universe doesn’t work that way.

Reformers have long argued that the fundamental problem is disproportionate union influence on school boards. Union members have a much stronger motive to vote in school board elections than anyone else, especially when the elections are held separately and require a special trip to the polls. Thus, at contract renewal time the union ends up “negotiating with itself.”

However, the NCTQ report’s main argument seems to be that we should be griping less about the actual bargaining process between districts and unions, and more about the laws passed by state legislatures mandating certain provisions in those contracts. The unions find it easier to extract what they want in the statehouse, NCTQ argues: “As unions have matured, their leaders have realized that it is more efficient to lobby state legislatures on particular provisions than to negotiate district by district every few years as contracts expire.”

The report collects a lot of useful information on the subject, and any contribution to knowledge on this badly understudied subject is valuable. And clearly NCTQ is right when it observes that bad state mandates ought to be deplored alongside bad district/union negotiations, and they currently aren’t.

But if I may play devil’s advocate (“When don’t you?” the unions may ask), I think NCTQ overstates its case on the importance of state mandates vis-a-vis district negotiations.

The report’s opening concedes that “the teacher contract still figures prominently on such issues as teacher pay,” but asserts that “on the most critical issues of the teaching profession, the state is the real powerhouse,” citing how teachers are evaluated, when they get tenure, their benefits, and the notorious issue of firing procedures. But are benefits really that important as an obstacle to reform, so long as compensation is structured on a factory-worker scale? And does the procedure for evaluating teachers matter as an obstacle to reform so long as evaluations play no role in compensation – again because compensation is structured on a factory-worker scale? When teachers get tenure and how hard it is to fire the bad ones are obviously important as obstacles to reform. But are they really so much more important than the factory-worker scale? Whether teachers get tenure early or late is less important than the fact that they get it. Disciplinary procedures only affect a small number of teachers. Even if we include the absence of a more widespread deterrent effect, we’re still not talking about something that affects all or even most of the profession.

I also think NCTQ is barking up the wrong tree when it argues that lobbying the state for goodies is more “efficient” than fighting for goodies district by district. As Hamilton, Madison and Jay (the “Three Founderos”) observed in the Federalist Papers, selfish interests will always find it easier to extract goodies from the public fisc in a whole bunch of little local places than in one big place. While centralization does provide one-stop shopping, it also creates more intense scrutiny and greater opportunities for opposition.

In fact, in the case of the teachers’ unions, I’m not even sure why it would take more resources to extract goodies on a district-by-district basis. They have to “negotiate district by district” anyway. They get coerced dues payments from millions of teachers precisely to pay the costs of negotiating in every district. And conditions on the ground in those districts are more favorable than those in the statehouse.

Moreover, the old saying goes “the crime is what’s legal.” In this case, the big obstacle to reform is what the teachers don’t have to bother negotiating for: the factory-worker structure of compensation. It’s not like they have to go back and win that all over again every time the contract comes up for renewal.

Finally, it’s not clear that state-mandated and district-negotiated provisions can be separated all that clearly. For example, check out this chart from the NCTQ report, illustrating how the process for firing teachers is mandated by state law in California:

Pretty nice graphic! But check out the contents of the first box:

School district must document specific examples of ineffective performance, based on standards set by the district and the local teachers union.

And the third box:

If the school board votes to approve dismissal . . .

And the fifth box:

School board must reconvene to decide whether to proceed . . .

And the seventh box:

. . . and persons appointed by the school board . . .

And the ninth box:

If . . . the school district appeals the decision . . .

See what I mean? The larger reality of the union/school board relationship will influence the board’s behavior in discipline cases. And the standards for documenting misconduct are subject to union/board negotiations.

I don’t mean to diminish NCTQ’s important contribution here. We should absolutely be paying more attention to state teacher contract mandates. But I think NCTQ goes too far in arguing that they’re more important than the dysfunctional school board system.


Being Misquoted

July 17, 2008

Dean Millot has a new post attacking me on the peer review issue that Eduwonkette promotes on her own site.

But Dean Millot is being fundamentally dishonest in that he misquotes me. He says that I argue: “In short, I see no problem with research becoming public with little or no review.”

In fact I wrote: “In short, I see no problem with research initially becoming public with little or no review.” (See here )

The absence of the word “initially” makes quite a difference and sets up the straw man that Millot wishes to knock down. The issue is not whether research can benefit from peer review, but whether it is inappropriate to make it publicly available INITIALLY, before it has received peer review.

Readers may well wonder about the credibility of Millot’s claim that “One of the reasons I do my best to quote the very words of people I write about in edbizbuzz is that I prefer to fight fair.”

And so much for Eduwonkette’s praise of Millot’s “measured, careful, and thoughtful analysis.”

I’m waiting for the correction and apology from both of them.


It Never Ends

July 14, 2008

I thought that the exchange with Eduwonkette over the appropriateness of releasing research without peer review had run its course with my last post.  But it seems that it will never end.  Here is her latest post and here is the reply that I posted in her comment section:

Eduwonkette is attempting to change the subject. I’ve never disputed that peer review can help provide additional assurances to readers about quality.  The issue is whether research ought to be available to the public even if it has not been peer reviewed.  In attacking the release of my most recent study Eduwonkette seems to be arguing that it is inappropriate to release research without peer review, at least under certain conditions that she only applies to research whose findings she does not like.  If she were going to be consistent, she would have to criticize anyone who releases working papers of their research, which would be almost everyone doing serious research.

 

What’s more, she is still trapped in a contradiction: she can’t say that we should analyze the motives of people who release research directly to the public when assessing whether it is appropriate, while she prevents analysis of her own motives because she blogs anonymously.  As I have now said several times, either she drops the suggestion that we analyze motives or she drops her role as an anonymous blogger.  If she refuses to resolve this contradiction, Ed Week should stop lending her their reputation by hosting her blog.  Let her be inconsistent in blogging at the expense of her own anonymous persona and not drain the respectability of Ed Week.

 

Lastly, the comparison of the market for education policy information and the market for cars comes from my most recent post in our exchange, but she oddly does not credit me here. (See https://jaypgreene.com/2008/07/12/see-were-in-italy/ )  Her position seems to be that we ought to forbid (or at least shun) the sale of used cars without warranties (translation: research without peer review).  My argument is that used cars without warranties come at a risk but there are compensating benefits.  Similarly, non-peer-reviewed research has its risks but also its benefits.

 

UPDATE — My exchange with Eduwonkette continues although it seems increasingly pointless.  Here is my (slightly edited) last comment on her site:

“Let’s make this very concrete. Was it inappropriate for Marcus Winters and me to release our social promotion findings in 2004 without peer review, or should we have waited until it had been peer-reviewed and published (in various forms) in 2006, 2007, and again in 2008? If the appropriate thing is to wait, would interest groups, editorial boards, and bloggers similarly hold their tongues until the additional evidence came in?  Would policymakers hold off on decisions that might have come out differently if they had the suppressed information?

Would it have been OK to release in 2004 as long as we tried to make it obscure enough so that people were less likely to find it? What if interest groups, bloggers, etc… found our obscure findings and promoted them (as has happened with Jesse Rothstein’s paper)?

And in saying ‘working papers and thinktank reports are released for entirely different functions’ you are repeating your call for an analysis of motives. You’ve said that think tanks want to influence policy (bad motive) while academics are trying to advance knowledge with each other (good motive). But if academics are serving the public good, shouldn’t they ultimately want to influence policy? I am an academic who also releases working papers through a think tank. Does that make my motives good or bad? I think all of this analysis of motives is silly when the real issue is the truth of claims, not why people are making those claims. Calling for an analysis of motives is especially silly for someone who is trying to influence people anonymously. The fact that you are trying to influence people through a blog does not give you a free pass from having to be consistent on this.”


See, we’re in Italy…

July 12, 2008

Stripes

“See, we’re in Italy.  The guy on the top bunk has gotta make the guy on the bottom’s bed all the time.  It’s in the regulations.  If we were in Germany I would have to make yours.  But we’re in Italy, so you’ve gotta make mine. It’s regulations.”

This is more or less Eduwonkette’s response to my complaint that she can’t argue that the source of information is important in assessing the truth of claims while blogging anonymously.  Her answer is that it’s different for bloggers (in Italy) than for researchers (in Germany).  It’s regulations.

She goes on to describe some differences between different types of information in education policy debates, but it’s not clear why any of those differences would be relevant to whether assessing the source is important for one and not for another.  The closest she comes to explaining why things are meaningfully different is when she says, “And let’s be realistic: an anonymous blogger isn’t shaping public policy.”  So, if information will have no bearing on policy debates, then its source is unimportant.

This would be a consistent argument if she really believed that bloggers had NO influence.  But of course they have at least some influence.  Why else would she and the rest of us be bothering with this?  And if bloggers have some influence, then the same basic principles should apply: either we should analyze the motives of sources of information to assess the truth of claims or we shouldn’t.  I’m in favor of not analyzing motives for anyone since I think that the truth of claims is independent of the motives of the source.  Even bad people can make true arguments.

At the risk of belaboring this issue, maybe I can clarify things by describing the market of ideas in policy debates as being like the market for cars.  We have different levels of confidence in cars that have gone through different processes before being made available for sale.  We could buy a used car from the corner used car dealer with no warranty.  That would be like reading blogs.  We don’t really know whether we are getting a lemon or not, since almost no assurances have been made about quality.  Or we could buy a used car from a larger chain with at least some warranty.  That would be like getting information from newspapers or magazines.  There has been some review and assurance of quality, but we still don’t quite know what we’ll get.  Or we could buy a new car from a major dealer and buy the extended warranty.  That would be like getting information from a peer-reviewed journal.  It may still be a lemon, but we’ve received a lot of assurances that it is not.  And I suppose reading an anonymous blogger is like buying a used car from someone you don’t know in the want ads.  There are trade-offs in getting cars with these different levels of assurances about quality, just as there are trade-offs in getting information that has gone through different processes to assure quality.

Eduwonkette’s argument is essentially that the same rules regarding these trade-offs don’t apply to the market for cars without warranties that do apply to the market for cars with warranties.  My view is that there are only differences in degree, not kind.  Even bad people can sell cars that are good values.

I’ve also noticed that Marc Dean Millot has weighed in on this issue.  He’s just knocking down a straw man.  It is not my position that research doesn’t benefit from peer review.  He can check out my cv to see that I have two dozen peer-reviewed publications, many of which were earlier released directly to the public without review.

I’ve been arguing that the public benefits from seeing research even before it has received peer review because it gets more information faster.  Without the assurances of peer review people will tend to have lower confidence in that research, and their confidence may increase as the research receives those additional assurances.  Millot seems to want to embargo information from the public until it receives peer review.  If he really believes that, then he should criticize every researcher with working papers on the web.  That’s almost everyone doing serious research.

And on his points about ideology tainting research I would suggest that people read Greg Forster’s excellent earlier post on Vouchers: Evidence and Ideology.