Eduresponses to Edubloggers

July 10, 2008

My recent posts on the release of our new study on the effects of high-stakes testing in Florida, and posts here and here on the appropriateness of releasing it before it has appeared in a scholarly journal, have produced a number of reactions.  Let me briefly note and respond to some of those reactions.

First, Eduwonkette, who started this all, has oddly not responded.  This is strange because I caught her in a glaring contradiction: she asserts that the credibility of the source of information is an important part of assessing the truth of a claim, yet her anonymity prevents anyone from assessing her credibility.  I would prefer that she resolve this contradiction by agreeing with my earlier defense of her anonymity: that the truth of a claim does not depend on who makes it.  But she has to resolve it one way or another — either she ends her anonymity or she drops the argument that we should assess the source when determining truth.

But apparently she doesn’t have to do anything.  Whose reputation suffers if she refuses to be consistent?  Her anonymity is producing just the sort of irresponsibility that Andy Rotherham warned about in the NY Sun and that I acknowledged even as I defended her.  The only reputation that is getting soiled is that of Education Week for agreeing to host her blog anonymously.  If she doesn’t resolve her double standard by either revising her argument or dropping her anonymity, Education Week should stop hosting her.  They shouldn’t lend their reputation to someone who will tarnish it.

Mike Petrilli over at Flypaper praises our new study on high-stakes testing but takes issue with the “pre-release spin” referencing comments by Chester Finn and Diane Ravitch about how high-stakes testing is narrowing the curriculum.  I agree with him that this study is not “the last word on the ‘narrowing of the curriculum.’”  But to the extent that it shows that another part of the curriculum (science) benefits when stakes are applied only to math and reading, it alleviates the concerns Checker and Diane have expressed.

As we fully acknowledge in the study, we don’t have evidence on what happens to history, art, or other parts of the curriculum.  And we only have evidence from Florida, so we don’t know if there are different effects in other states.  But the evidence that high stakes in math and reading contribute to learning in science should make us less convinced that all low stakes subjects are harmed.  Perhaps school-wide reforms that flow from high stakes in math and reading produce improvements across the curriculum.  Perhaps improved basic skills in literacy and numeracy have spill-over benefits in history, art, and everything else as students can more effectively read their art texts and analyze data in history.

Andy Rotherham at Eduwonk laments that what I describe as our “caveat emptor market of ideas” doesn’t work very well.  I agree with him that people make plenty of mistakes.  But I also agree with him that “in terms of remedies there is no substitute for smart consumption of information and research…”  There is no Truth Committee that will figure everything out for us.  And any process of reviewing claims before release will make its own errors and will come at some expense of delay.  Think Tank West has added some useful points on this issue.

Sherman Dorn, who rarely has a kind word for me, says: “Jay Greene (one of the Manhattan Institute report’s authors and a key part of the think tank’s stable of writers) replied with probably the best argument against eduwonkette (or any blogger) in favor of using PR firms for unvetted research: as with blogs, publicizing unvetted reports involves a tradeoff between review and publishing speed, a tradeoff that reporters and other readers are aware of.”  He goes on to have a very lengthy discussion of the issue, but I was hypnotized by his rare praise, so I haven’t yet had a chance to take in everything else he said.


Eduwonkette Apologizes

July 8, 2008

I appreciate Eduwonkette’s apology posted on her blog and in a personal email to me.  It is a danger inherent in the rapid-fire nature of blogging that people will write things more strongly and more sweepingly than they might upon further reflection.  I’ve already done this on a number of occasions in only a few months of blogging, so I am completely sympathetic and un-offended.

One could argue that these errors demonstrate why people shouldn’t write or read blogs.  In fact, some people have argued that ideas need a process of review and editing before they are shown to the public.  These people tend to be ink-stained employees of “dead-tree” industries or academia, but they have a point: there are costs to making information available to people faster and more easily.

Despite these costs, the ranks of bloggers and web-readers have swelled.  That is because the benefits of making more information available to more people, much faster, outweigh the costs of doing so.  People who read blogs and other material on the internet are generally aware of the greater potential for error, so they usually have a lower level of confidence in information obtained from these sources than from other sources with more elaborate review and editing processes.  Some material from blogs eventually finds its way into print and more traditional outlets, and readers increase their confidence level as that information receives further review.

Of course, the same exact dynamics are at work in the research arena.  Releasing research directly to the public and through the mass media and internet improves the speed and breadth of information available, but it also comes with greater potential for errors.  Consumers of this information are generally aware of these trade-offs and assign higher levels of confidence to research as it receives more review, but they appreciate being able to receive more of it sooner with less review.

In short, I see no problem with research initially becoming public with little or no review.  It would be especially odd for a blogger to see a problem with this speed/error trade-off without also objecting to the speed/error trade-offs that bloggers have made in displacing newspapers and magazines.  If bloggers really think ideas need review and editing processes before they are shown to the public, they should retire their laptops and cede the field to traditional print outlets. 

We have a caveat emptor market of ideas that generally works pretty well.

So it was disappointing that following Eduwonkette’s graceful apology, she attempted to draw new lines to justify her earlier negative judgment about our study released directly to the public.  She no longer believes that the problem is in public dissemination of non-peer-reviewed research.  She’s drawn a new line that non-peer-reviewed research is OK for public consumption if it contains all technical information, isn’t promoted by a “PR machine,” isn’t “trying to persuade anybody in particular of anything,” and is released by trustworthy institutions.

The last two criteria are especially bothersome because they involve an analysis of motives rather than an analysis of evidence.  I defended Eduwonkette’s anonymity on the grounds that it doesn’t matter who she is, only whether what she writes is true.  But if Eduwonkette believes that the credibility of the source is an important part of assessing the truth of a claim, then how can she continue to insist on her anonymity and still expect her readers to believe her?  How do we know that she isn’t trying to persuade us of something and isn’t affiliated with an untrustworthy institution if we don’t know who she is?  Eduwonkette can’t have it both ways.  Either she reveals who she is or she remains consistent with the view that the source is not an important factor in assessing the truth of a claim.

No sooner does Eduwonkette establish her new criteria for the appropriate public dissemination of research than we discover that she has not stuck to those criteria herself.  Kevin DeRosa asks her in the comments why she felt comfortable touting a non-peer-reviewed Fordham report on accountability testing.  That report was released directly to the public without full technical information, was promoted by a PR machine, and comes from an organization that is arguably trying to persuade people of something and whose trustworthiness at least some people question.

So, she articulates a new standard: releasing research directly to the public is OK if it is descriptive and straightforward.  I haven’t combed through her blog’s archives, but I am willing to bet that she cites more than a dozen studies that fail to meet any of these standards.  Her reasoning looks like an ad hoc attempt to justify criticizing the release of a study whose findings she dislikes.

Diane Ravitch also chimes in with a comment on Eduwonkette’s post: “The study in this case was embargoed until the day it was released, like any news story. What typically happens is that the authors write a press release that contains findings, and journalists write about the press release. Not many journalists have the technical skill to probe behind the press release and to seek access to technical data. When research findings are released like news stories, it is impossible to find experts to react or offer ‘the other side,’ because other experts will not have seen the study and not have had an opportunity to review the data.”

Diane Ravitch is a board member of the Fordham Foundation, which releases numerous studies on an embargoed basis to reporters “like any news story.”  Is it her position that this Fordham practice is mistaken and needs to stop?


Weekend PJM Column

June 25, 2008

(Guest post by Greg Forster)

I was out of town earlier this week and didn’t get a chance to post a link to my Pajamas Media column on the D.C. voucher evaluation, which ran over the weekend. It’s here.


What Does the Red Pill Do If I Don’t Take It?

June 19, 2008


(Guest Post by Matthew Ladner)

The hidden highlight from the Evaluation of the DC Opportunity Scholarship Program: Impacts After Two Years report is buried in the Appendix, pp. E-1 to E-2:

Applying IV analytic methods to the experimental data from the evaluation, we find a statistically significant relationship between enrollment in a private school in year 2 and the following outcomes for groups of students and parents (table E-1):

• Reading achievement for students who applied from non-SINI schools; that is, among students from non-SINI schools, those who were enrolled in private school in year 2 scored 10.73 scale score points higher (ES = .30) than those who were not in private school in year 2.

• Reading achievement for students who applied with relatively higher academic performance; the difference between those who were and were not attending private schools in year 2 was 8.36 scale score points (ES = .24).

• Parents’ perceptions of danger at their child’s school, with those whose children were enrolled in private schools in year 2 reporting 1.53 fewer areas of concern (ES = -.45) than those with children in the public schools.

• Parental satisfaction with schooling, such that, for example, parents are 20 percentage points more likely to give their child’s school a grade of A or B if the child was in a private school in year 2.

• Satisfaction with school for students who applied to the OSP from a SINI school; for example, they were 23 percentage points more likely to give their current school a grade of A or B if it was a private school.

I’m trying to figure out why the impact of actually using the voucher isn’t the focus of this study, and why it is instead presented in an appendix.  All of the “mixed” results in the main report measure the impact of having been offered a scholarship, whether the student actually used it or not.

I’m going to walk way out on a limb here and predict that the impact on test scores of being offered but not using a voucher will be indistinguishable from zero.  If this were a medical study, it would be as if patients in the experimental group were offered a drug, some of them chose not to take it, and we ignored that fact and measured the drug’s impact based on the results of both those who took it and those who didn’t.  Holding the pill bottle can’t be presumed to have the same impact as taking the pills.

We’ve all been told that exercise is good for our health.  Should we judge the effectiveness of exercise on health outcomes by what happens to those who actually exercise, or by the results for everyone who has been told that it is good for them?
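To make the distinction concrete, here is a minimal sketch in Python using entirely invented numbers: it shows how an offer-based (intent-to-treat) estimate gets diluted by non-users, and how an instrumental-variables (Wald) calculation of the kind used in the report’s appendix recovers the effect of actual use.  The take-up rate, effect size, and score scale below are all my assumptions, not figures from the evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Randomly offer a (hypothetical) voucher to half the sample.
offered = rng.integers(0, 2, n)

# Assume only 70% of offered students actually use the voucher,
# and that no one without an offer attends a private school.
used = offered * (rng.random(n) < 0.7)

# Invented outcome: a 10-point gain from actually using the voucher.
scores = 500 + 10 * used + rng.normal(0, 30, n)

# Intent-to-treat (ITT): compare offered vs. not offered, ignoring
# whether the offer was used (the report's headline analysis).
itt = scores[offered == 1].mean() - scores[offered == 0].mean()

# Wald/IV estimate: rescale the ITT by the take-up rate to recover
# the effect of actually attending a private school.
take_up = used[offered == 1].mean() - used[offered == 0].mean()
iv = itt / take_up

print(f"ITT estimate: {itt:.1f}")  # about 7 points: diluted by non-users
print(f"IV estimate:  {iv:.1f}")   # about 10 points: effect of actual use
```

With 70 percent take-up, the offer-based estimate is only about seven-tenths the size of the usage effect, which is exactly the dilution described above.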

This shortcoming has been corrected in the Appendix, but that correction is getting very little attention.  On page 24, the evaluation reads:

Children in the treatment group who never used the OSP scholarship offered to them, or who did not use the scholarship consistently, could have remained in or transferred to a public charter school or traditional DC public school, or enrolled in a non-OSP-participating private school.

So in the report’s main discussion, the kids actually attending private schools have to make gains big enough to make up for the fact that many “treatment” kids are actually back in DCPS.  As it turns out, several subsets of students do make such gains, but that’s not the point.  The point is that we ought to be primarily concerned with whether actual utilization of the program improves education outcomes, and with the systemic effects of the program.  We should indeed study who actually uses this program, who chooses not to, and the reasons why (very important information), but that sort of analysis belongs in the appendix, with the impact of actual use in the main report, not the other way around.

Receiving an offer of a school voucher doesn’t constitute much of an education intervention, and it seems painfully obvious that the discussion around this report is conflating the impact of voucher offers with that of voucher use. The impact of voucher use is clear and positive.


The SAT and College Grades

June 18, 2008

(Guest post by Larry Bernstein)

Yesterday, the College Board released a study of the predictive power of the SAT for estimating a student’s freshman-year college grade point average.  A Bloomberg article criticized the new SAT for its relative ineffectiveness at predicting college grades: its predictive power is only trivially improved by the addition of the new essay exam, which adds test time and is costly to grade.

I think this should come as no surprise, and it shows the general limitations of using standardized tests to predict college grades.  One of the key points made in the study is that high school grades are a better predictor than the SAT.  High school grades need to be included with the SAT to best estimate GPA.

In my 1985 Wharton undergraduate statistics class, each student was required to create a regression research project.  By chance, I chose to research predicting my classmates’ college GPAs.  I used 20 variables, including SAT score, and I found only 5 variables with statistical significance: SAT score, number of hours studied, Jewish or Gentile, Wharton or another school such as the College of Arts and Sciences, and raised in the Northeast or elsewhere.

Similar to the national studies, in my survey of 100 fraternity brothers the SAT score did a mediocre job of predicting college GPA as a single variable. The key variable in my study was the number of hours studied. You would be surprised by the variance in Ivy Leaguers’ study habits. My survey asked students to estimate the number of hours as 1-10, 10-20, 20-30, or 30-40.  My favorite response was: “Is this per semester?” I assumed the student would realize it was per week! Work habits and effort played a critical role in estimating college GPA. Obviously, the college placement office will have difficulty estimating this variable, though difficulty of course load and number of AP classes might help.

The rest of the variables seem obvious.  It is much more difficult to get into Wharton than the other programs at Penn.  So it is no surprise that Wharton students were running circles around the non-Wharton students, even adjusting for SAT scores and hours studied.  In addition, it is much more difficult to get into Penn from the Northeast than from other areas of the country.
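For readers curious what such a class project looks like, here is a minimal sketch of that kind of regression with entirely fabricated data; the variable names, coefficients, and sample values are my inventions standing in for the survey described above, not Bernstein’s actual results.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100  # roughly the size of the fraternity survey described above

# Fabricated predictors loosely mirroring the significant variables
# named in the post; all coefficients are invented for illustration.
df = pd.DataFrame({
    "sat": rng.normal(1300, 100, n),
    "hours_studied": rng.choice([5, 15, 25, 35], n),
    "wharton": rng.integers(0, 2, n),
    "northeast": rng.integers(0, 2, n),
})
df["gpa"] = (
    1.0
    + 0.0012 * df["sat"]
    + 0.020 * df["hours_studied"]
    + 0.20 * df["wharton"]
    + 0.10 * df["northeast"]
    + rng.normal(0, 0.3, n)
)

# Ordinary least squares, as in the class project: regress GPA on
# the candidate predictors and check which p-values clear 0.05.
X = sm.add_constant(df[["sat", "hours_studied", "wharton", "northeast"]])
print(sm.OLS(df["gpa"], X).fit().summary())
```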

Very few of the Jews were jocks.  Needless to say, my college fraternity had plenty of sampling problems.


The DC Voucher Evaluation

June 16, 2008

(Guest post by Greg Forster)

Today the U.S. Dept. of Education released the fourth annual report on the random-assignment evaluation of the DC voucher program, including academic results for the first two years of the program’s existence. As with last year’s report, across the whole population the voucher students had higher academic outcomes than the control group, but the positive results just barely fell short of the conventional cutoff for statistical certainty. This means that while the voucher students in fact had higher test scores, we cannot be 95 percent confident that their higher scores are due to vouchers and not a statistical fluke. This year it was the reading results that came close to statistical significance, reaching 91 percent certainty. The study also finds statistically certain positive results for three subgroups, which together comprise 88 percent of the voucher population.
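For readers unused to the “percent certainty” framing, it is simply one minus the p-value.  A quick sketch with invented numbers (not the study’s actual estimates) shows how a positive difference can fall just short of the conventional 95 percent threshold:

```python
from scipy import stats

# Invented numbers for illustration (not from the DC evaluation).
diff = 2.4  # hypothetical voucher-minus-control score difference
se = 1.41   # hypothetical standard error of that difference

z = diff / se
p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # two-sided test
print(f"z = {z:.2f}, p = {p_value:.3f}, confidence = {1 - p_value:.0%}")
# p comes out around 0.09, i.e. about 91% confidence: a positive
# estimate that nonetheless misses the conventional p < 0.05 cutoff.
```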

Since the previous year’s results were also not statistically significant, this update of the study doesn’t change the balance of the studies on school choice. As before, there are a total of ten random-assignment studies on school vouchers, all ten of which found that the voucher students had higher academic achievement, with eight studies achieving statistical certainty for the positive finding and two not.

In other words, school vouchers are still better supported by high-quality scientific evidence than any other education policy. If you reject vouchers because this study is only 91 percent sure they produce academic improvements, you have no empirical grounds for supporting any other policy, since all other policies are far less well supported by empirical evidence than vouchers.

In a few minutes you’ll be able to see the Friedman Foundation’s response to the DC study, including details and citations on all ten random-assignment studies of vouchers, here.


Grad Rates Higher in Milwaukee Voucher Program

May 31, 2008

In case anyone missed the release of this study this week, Rob Warren of the University of Minnesota has a new study comparing high school graduation rates in Milwaukee’s voucher program and public schools.  The bottom line is that students graduate at much higher rates in the voucher program. 

Warren is careful to emphasize that he cannot draw causal inferences from this work.  That is, the voucher students graduate at higher rates than public school students, but he can’t say whether the voucher program caused their higher graduation rate.  That kind of conclusion can only be drawn from a study that compares apples to apples.  With Pat Wolf I am involved in an evaluation that will be able to produce a graduation rate comparison of matched samples of voucher and public school students, but results are still a few years down the road.
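To give a flavor of what a matched-sample (apples-to-apples) comparison involves, here is a minimal nearest-neighbor matching sketch with fabricated data; the matching variable, graduation rates, and sample sizes are all invented, and the actual evaluation’s design is considerably more elaborate.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 500

# Fabricated data: voucher students and a larger public school pool.
voucher = pd.DataFrame({
    "baseline_score": rng.normal(0.2, 1, n),
    "graduated": rng.random(n) < 0.75,
})
public = pd.DataFrame({
    "baseline_score": rng.normal(0.0, 1, 5 * n),
    "graduated": rng.random(5 * n) < 0.65,
})

# Nearest-neighbor match (with replacement): pair each voucher student
# with the public school student closest on the baseline covariate.
idx = [
    (public["baseline_score"] - s).abs().idxmin()
    for s in voucher["baseline_score"]
]
matched_public = public.loc[idx]

# Compare graduation rates across the matched samples.
print(f"Voucher grad rate:        {voucher['graduated'].mean():.1%}")
print(f"Matched public grad rate: {matched_public['graduated'].mean():.1%}")
```

Matching on observed baseline characteristics narrows the comparison to similar students, though only random assignment rules out differences on unobserved traits.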


Strawman — er, I mean — Strawperson

May 22, 2008

The American Association of University Women released a report this week attempting to debunk concerns that have been raised about educational outcomes for boys.  The AAUW report received significant press coverage, including articles in the WSJ and NYT.

But the AAUW report simply debunks a strawman — er, I mean — strawperson.  The report defines its opponents in this way: “many people remain uncomfortable with the educational and professional advances of girls and women, especially when they threaten to outdistance their male peers.”  Really?  What experts or policymakers have articulated that view?  The report never identifies or quotes its opponents, so we are left with only the Scarecrow as our imaginary adversary.

Once this strawperson is built, it’s easy for the report to knock it down.  The authors argue that there’s no “boy crisis” because boys’ educational outcomes have not declined, and have even made gradual gains, over the last few decades.  And the gap between outcomes for girls and boys has not grown significantly larger.

This is all true, as far as it goes, but it does not address the actual claims that are made about problems with the education of boys.  For example, Christina Hoff Sommers’ The War Against Boys claims: “It’s a bad time to be a boy in America… Girls are outperforming boys academically, and girls’ self-esteem is no different from boys’. Boys lag behind girls in reading and writing ability, and they are less likely to go to college.”  Sommers doesn’t say that boys are getting worse or that the gap with girls is growing.  She only says that boys are under-performing and deserve greater attention. 

Nothing in the new AAUW report refutes those claims.  In fact, the evidence in the report clearly supports Sommers’ thesis.  If we look at 17-year-olds, who are the end product of our K-12 system, we find that boys trailed girls in reading by 14 points on the most recent administration of the Long-Term NAEP in 2004 (see Figure 1 in AAUW).  In 1971 boys trailed by 12 points.  And in 2004 boys scored 1 point lower than they did in 1971.

In math, the historic advantage that boys have had is disappearing.  In 1978, 17-year-old boys led girls by 7 points on the math NAEP, while in 2004 they led by only 3 points (see Figure 2 in AAUW).  Both boys and girls have made small improvements since 1978, but no net improvement since 1973.

Boys also clearly lag girls in high school graduation rates.  According to a study I did with Marcus Winters, 65% of the boys in the class of 2003 graduated with a regular diploma versus 72% of girls.  Boys also lag girls in the rate at which they attend and graduate from college.  Meanwhile, boys exceed girls in rates of imprisonment, suicide, and violent death.

It takes extraordinary effort by the AAUW authors to spin all of this as refuting a boy crisis.  They focus on how the gap is not always growing larger and that boys are sometimes making gains along with girls.  They also try to divert attention by saying that the gaps by race/ethnicity and income are more severe.  But no amount of spinning can obscure the basic fact that boys are doing quite poorly in our educational system and deserve some extra attention.

To check out what other bloggers are saying on this report see Joanne Jacobs, and just this morning, Carrie Lukas in National Review Online.


Odds and Ends

May 3, 2008

Florida just passed an expansion of its tax-credit funded scholarship program (read: vouchers).  You can see the bill here.  So much for vouchers being politically dead.

Also, here is a neat new study in Education Next by William Howell and Marty West that finds that Americans seriously underestimate how much is spent per pupil in K-12 public education, as well as how much teachers are paid.


More Special Ed Voucher Study

May 3, 2008

I’ll be on C-SPAN with Marcus Winters Monday morning at 9 am ET to discuss our new study on special education vouchers.  The show is Washington Journal, which has a call-in format.  We look forward to hearing from you!

You can find links to the study and some op-eds in the announcement below:

New Report by Jay Greene and Marcus Winters
Manhattan Institute senior fellows Jay P. Greene and Marcus A. Winters have released a new report entitled “The Effect of Special Education Vouchers on Public School Achievement: Evidence from Florida’s McKay Scholarship Program.”  The authors conclude that the McKay program has had a positive effect on the quality of education that public schools provide to disabled students.
To read the report, click here.


WASHINGTON TIMES SERIES:
Winters and Greene have been featured in a three-part series for The Washington Times; read their articles below.
The Politics of Special-Ed Vouchers, Jay P. Greene and Marcus A. Winters, Washington Times, 05-01-08
Vouchers for special-ed students, Jay P. Greene and Marcus A. Winters, Washington Times, 04-30-08
Vouchers and Special Education, Marcus A. Winters and Jay P. Greene, Washington Times, 04-29-08


OP-ED:
A Special-Ed Fix, Jay P. Greene and Marcus A. Winters, New York Post, 04-30-08
INTERVIEW:
An Interview with Marcus Winters: Special Education Vouchers, EdNews.org, 04-30-08

UPDATE

Here’s another op-ed in an Arizona newspaper.