The TIMSS Rorschach Test

December 9, 2008

The Rorschach inkblot test is a psychology test that was used to assess personality and emotions.  The way in which people saw ambiguous images, like the one above, was supposed to say something about who they really were.

The same is true for the interpretations being applied to the results of the 2007 TIMSS (Trends in International Mathematics and Science Study) released today.

Over at Flypaper, Mike Petrilli interprets the gains the US has made in math but not science as suggesting that accountability testing is shifting resources toward math and away from science: “The lesson is that what gets tested gets taught. Under the No Child Left Behind act, and state accountability systems before that, elementary schools have been held accountable for boosting performance in math and reading. There is evidence that American elementary schools are spending less time teaching science, and this is showing up in the international testing data.”

And Mike interprets the relatively good results that Minnesota had (yes, MN took the test as if it were a country) as supporting rigorous standards: “There’s also good news out of Minnesota today, which has made dramatic gains since adopting new, more rigorous math standards.”

But also at Flypaper, Diane Ravitch offers different interpretations.  She sees the gains even in math results as “actually small, only four points.”  She also declines to credit NCLB for any of those gains, even as a perverse result of resource shifting away from science.  She notes that gains were at least as large in the US during the period prior to implementation of NCLB.  And on the topic of Minnesota she takes issue with Mike’s explanation for success: “Minnesota showed dramatic gains on TIMSS not because of ‘new, more rigorous standards,’ but because of that state’s decision to implement a coherent grade-by-grade curriculum in mathematics.”  Umm, I would explain the difference but I got so bored trying to distinguish standards from curriculum that I dozed off for a bit.

Rather than focusing on the gains (or lack of gains) made by the US relative to itself in the past, Mark Schneider at Education Week focuses on the comparison between the US and other countries.  He notes that while the US looks relatively strong on the TIMSS, that is distorted by the large number of  “low-performing countries in the calculation of the international average [including Jordan, Romania, Morocco, and South Africa that] drives down that average, improving the relative performance of our students.”

He further notes that we fare worse on the PISA, which reports results from the 30 OECD countries who are our major trading partners and economic competitors: “We do better in TIMSS than we do on PISA, but this is a function of the countries that participate in each, and we should not let the relatively good TIMSS results lull us into a false sense of complacency. Even in the relatively easier playing field of TIMSS, we are lagging far too many countries in overall math performance and in the performance of our best students.”

And at Huffington Post Gerald Bracey was able to offer his reaction to the results last week, before they were released.  He wrote: “It might be good to keep a few things in mind when considering the data:

1. The Institute for Management Development rates the U. S. #1 in global competitiveness.

2. The World Economic Forum ranks the U. S. #1 in global competitiveness.

3. The U. S. has the most productive workforce in the world.

4. “The fact is that test-score comparisons tell us little about the quality of education in any country.” (Iris Rotberg, Education Week June 11, 2008).

5. ‘That the U. S., the world’s top economic performing country, was found to have schooling attainments that are only middling casts fundamental doubts on the value, and approach, of these surveys…'”

Bracey also said that our students could beat up the students in other countries with higher TIMSS scores.  (Actually, I made that last bit up.)

To summarize, Mike Petrilli sees evidence supporting his past concerns about the narrowing of the curriculum and the need for rigorous standards.  Diane Ravitch sees no evidence to alter her negative view of NCLB.  Mark Schneider, the former head of the National Center for Education Statistics, sees the need to review more testing.  And Gerald Bracey doesn’t even have to see the results to know that our education system is doing a great job.  And when I look at the inkblot I see a pudgy guy with a beard and male-patterned baldness laughing.

(edited for clarity)


New Math

December 9, 2008

Those concerned about new-fangled math instruction should be aware that new math isn’t so new.  Here’s a classic Tom Lehrer song on the topic — from 1965!


No Consumer Left Behind

December 8, 2008

The news is reporting today that the Republican (last time I checked) Bush Administration and Congressional Democrats are close to an agreement to bail out the auto industry.  The terms of the deal involve a $15 billion bridge loan and a federal oversight board.

It’s now becoming clear that rather than moving K-12 public education to look more like a competitive market, we are moving the competitive market to look more like K-12 public education.  To assist in those efforts (can’t nobody say JPGB never did nothing for the peoples), I would like to propose the No Consumer Left Behind act.  You don’t even need a new acronym!

Under the No Consumer Left Behind act we will provide a system of goals and assistance to ensure that all companies serve their consumers effectively.  No longer will we have stigmatizing terms like “bankruptcy.”  Instead, we will have “companies in need of improvement.” 

All companies will have to achieve profitability by 2014.  And they can define for themselves what “profitability” really means.  Each year they must make adequate yearly progress toward that goal.  If a company fails to make AYP they must offer their consumers the option to buy a different product that the same company sells.  After all we have to have choice!

Companies that are in need of improvement will also be provided with additional resources and professional development.  If we don’t help them, how else can they help their consumers?  We won’t call these additional resources a bailout or reward for failure.  Instead, we will call it technical assistance.  It’s just technical — like a technical foul.

We will also require all companies to employ “highly qualified” workers.  Highly qualified will generally be defined as whoever they currently employ.  Alternatively, highly qualified can be restricted to workers possessing union-approved credentials.

If a company fails to make AYP for several years, it will have to “restructure.”  But restructuring won’t be like the old bankruptcy restructuring, where you have to sell assets or lay off workers.  Instead, it can mean that you held some team-building workshops or hired a new CEO.  This new NCLB will be all about accountability.

And as you can tell from the title of the proposed law — we are doing all of this because we care about the consumer.  By focusing on companies in need of improvement, offering product choice within companies, providing additional resources to companies, requiring highly qualified workers, and redefining restructuring to mean essentially nothing, we are taking all of the steps necessary to help the companies — err, I meant consumer.

(HT: Bob Maranto)


The Humpty Dumpty Arkansas Courts

December 7, 2008

Courts claim to be in the business of interpreting the meaning of laws.  But the oddly limited or expansive meanings that are selectively applied to the words in those laws suggest that they are engaged in a completely different enterprise — namely, politics.  The idea that courts are just another political institution has long been held by political scientists, including myself.  We tend not to be hypnotized by the black robes, marble columns,  and Latin jargon into buying the notion that judges are some sort of special priesthood, immune from and indifferent to politics.  

Judges are just regular pols without the typical reelection pressures but also without the typical resources to advance their agenda.  Legislators have the power of the purse while executives have the power of the sword, but judges just have the power of their word.   The limitation on the power of judges is not the constraint of reelection, but the constraint of having to convince the other branches and the public to do what they say.  Cultivating the image of a disinterested priesthood enhances the power of judges to get others to do what they say.  But if the judges demand too much, they undermine their priestly image and erode their future power. 

Judges have been in a particularly strong position to get others to do what they say for the last five decades.  Early in the civil rights struggle our democratic institutions failed us, protecting obviously unjust and illiberal practices.  After initially siding with these illiberal forces (see Dred Scott or Plessy), the Courts detected a shift in elite opinion and joined forces with those elites to consolidate a new, progressive coalition.    The Courts could rightly take credit for having helped rescue us from the failure of our democratic institutions. 

Because they were instrumental in civil rights,  judges accumulated a considerable amount of political capital and popular goodwill.  And they’ve been spending that political capital ever since.   The civil rights era gave the Courts the role as guardians of our liberal virtue.  So, it’s hard to suggest that the Courts have overstepped their bounds, usurped the power of other branches, or arbitrarily interpreted the law without being accused of opposing the liberal virtue that Courts are supposed to protect.  Past critics of over-reach by the Courts included segregationists, so if you criticize judicial over-reach today on some other topic you must also be a segregationist.

This is especially true in Arkansas, where the memories of desegregation battles at Little Rock’s Central High School are particularly painful.  You cannot criticize Arkansas Courts for over-stepping their bounds or abusing their authority without being accused of being Orval Faubus — and there is no worse political insult in Arkansas.  The problem with immunity from legitimate criticism is that Arkansas Courts are especially unaccountable for judicial over-reach or arbitrariness. 

The most salient recent example of this is the action of the state Supreme Court in the Lake View school funding case.  The state constitution does  say that the state must “maintain a general, suitable and efficient system of free public schools.”  But who knew that general, suitable, and efficient meant that there was a specific dollar amount that had to be spent on every student in Arkansas?  And who knew that that amount had to increase by at least the rate of inflation every year?  I doubt that the authors of the Arkansas Constitution knew that general, suitable, and efficient meant all of these things, but the members of the Arkansas Supreme Court sure did.  And they figured out how much the legislature needed to spend per pupil and for school infrastructure by appointing Special Masters, who convened public meetings, received testimony from interested parties, and wrote a report summarizing their recommendations. 

Of course, there already exists a body for holding public meetings, receiving testimony from interested parties, and deciding upon the appropriate levels of public spending — it’s called the legislature.  With the appointment of Special Masters the Arkansas Supreme Court clearly usurped the legislature’s power.  And the Special Masters showed no restraint in determining spending priorities for the state — a power reserved by the Constitution for the legislature.  They declared: “[School districts] should have the means to meet the challenge if the State remains committed to the all-important practice of funding education first.”  Where in the state Constitution does it say that education has the first priority on resources? 

Some have argued that the responsibility to fund education first is implied by having education as the only policy area specifically mentioned in the Constitution.  I’m sorry to say that these people have never read the Arkansas Constitution.  It also specifically mentions a number of other policy areas, including the need for an agriculture, mining, and manufacturing policy.  Specifically, it says that the legislature must pass laws to “foster and aid the agricultural, mining and manufacturing interests of the State.”  If the Court and its Special Masters see the words general, suitable, and efficient as meaning that education must be supported as the first priority and at a specific, ever-increasing amount of spending, why haven’t they interpreted “foster and aid” to mean that the legislature must provide specific subsidies to agriculture, mining, and manufacturing?

Clearly we have a Humpty Dumpty Court.  The words mean what they want them to mean.  General, suitable, and efficient have expansive meanings if it suits their purposes while foster and aid mean essentially nothing.  Only judges, as the special class of high priests, possess the magical glasses that allow them to read between the lines of the Constitution to see that one phrase implies the moon while the other implies bupkis.

And now the Arkansas Supreme Court is at it again.  They are currently hearing arguments on whether a state law exempting state contracts in excess of $5 million from competitive bidding violates the state Constitution.  A plain reading of the text would suggest that it does.  The Constitution states: “All contracts for erecting or repairing public buildings or bridges in any county, or for materials therefor shall be given to the lowest responsible bidder, under such regulations as may be provided by law.” 

But Circuit Judge Jay Moody ruled that the state law did not violate the Constitution because he interpreted the provision as only applying to contracts from county governments — not contracts made by the state government and its agencies.  I’d like you to re-read the constitutional provision and ask yourself whether this is the most reasonable interpretation of the language.  Doesn’t the phrase “in any county” seem to describe the location of public buildings and bridges, emphasizing that the bidding requirement applies in all parts of the state, not the government agency engaging in the contracting?

We don’t know how the state Supreme Court will rule on the matter, but figuring that out requires a political, not a linguistic analysis.  They can and will interpret it in any way they see fit to advance their interests.  The words can mean just about anything they want them to mean.  “The question is which is to be master — that’s all.”

UPDATE:  The Arkansas Supreme Court interpreted the clause as applying only to county contracts and upheld the state law.  The decision can be found here.


Violating the Denominator Law

December 2, 2008

Sean Corcoran, who is guest blogging for the blogger formerly known as Eduwonkette, may have to go to education research jail because he violated the Denominator Law today.  For those of you unfamiliar with the Denominator Law from my previous post on the topic (and ignorance of the law is no excuse) it is: “No one should be allowed to highlight numerators without also presenting denominators.  That is, it is often misleading to describe a big number without putting that number in perspective.”

So, Sean is all worried about private donations to public schools creating or exacerbating inequities in funding.  He references a report about California (and it had better be peer-reviewed or the blogger formerly known as Eduwonkette will throw a fit) that finds: “contributions to California school foundations rose from $123 million in 1992 to $238 million in 2001.”  He does helpfully add that $238 million only amounts to $40 per pupil.  But he doesn’t fully comply with the Denominator Law because he fails to point out that $238 million only represents about 0.46% of the $52.2 billion in total public school revenue in California in 2001.

It’s not the average amount of private giving in California that really worries him.  What concerns him is that these donations are concentrated in wealthy areas: “Of course—as Brunner and Imazeki point out—these contributions are far from evenly distributed. Donations are strongly related to family income, and in some cases they are quite high, at more than $250 to $500 per student. (You can read about the $3.3 million education foundation in Santa Monica-Malibu Unified School District here).”

Here, your honor, is where he flagrantly breaks the Denominator Law.  He suggests that $250 to $300 per pupil, as illustrated by $3.3 million in private giving to public schools in Santa Monica, is “quite high.”  Without a denominator, it’s hard to judge how high $3.3 million in Santa Monica really is. 

Let me help.  According to the School Matters web site operated by Standard and Poor’s, Santa Monica has 12,191 students.  The private contributions Corcoran mentions amount to $271 per pupil — within his $250 to $300 range.  But total revenue for Santa Monica public schools amounted to $11,062 per pupil as of 2006.  Private contributions of $271 amount to only 2.4% of total revenue — not exactly “quite high.”

And this private giving hardly accounts for resource differences between Santa Monica and the average district in California.  According to School Matters the average district in CA had total revenue of $9,553 per pupil as of 2006, $1,509 less than in Santa Monica.  If Santa Monica received $271 per pupil in private donations compared to $40 for the average California district, the extra $231 could only account for about 15% of the extra resources Santa Monica possesses. 
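For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above; the function and variable names are purely illustrative and come from no official source.

```python
def share(numerator, denominator):
    """Return numerator as a percentage of denominator."""
    return 100.0 * numerator / denominator

# Figures quoted in the post (variable names are illustrative)
sm_donations = 3_300_000   # private giving in Santa Monica, in dollars
sm_students = 12_191       # Santa Monica enrollment, per School Matters
sm_revenue_pp = 11_062     # Santa Monica total revenue per pupil, 2006
ca_revenue_pp = 9_553      # average California district revenue per pupil, 2006
ca_donations_pp = 40       # statewide private giving per pupil

# Put the big numerator over its denominators
sm_donations_pp = sm_donations / sm_students
revenue_gap = sm_revenue_pp - ca_revenue_pp
donation_gap = sm_donations_pp - ca_donations_pp

print(f"Donations per pupil: ${sm_donations_pp:.0f}")
print(f"Donations as share of revenue: {share(sm_donations_pp, sm_revenue_pp):.1f}%")
print(f"Revenue gap vs. average district: ${revenue_gap:,}")
print(f"Share of gap explained by donations: {share(donation_gap, revenue_gap):.0f}%")
```

The point of wrapping the division in a helper like `share` is simply that every big number gets reported next to its denominator.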

If this is the worst case that folks can muster, it hardly seems like private giving is a significant contributor to resource inequities.  We only gain this appropriate perspective when we comply with the Denominator Law — so be sure to follow the law out there.


Replication, The True Test of Research Quality

December 2, 2008

When people can’t argue the facts, they argue peer review.  That’s been my experience when I’ve released non-peer reviewed reports.  Without peer review, folks wonder, how can we know whether to trust these results?

The reality is that even with peer review people still need to wonder whether to trust results.  Peer-review is by definition irresponsible — by which I mean that the reviewers have no responsibility.  By being anonymous, reviewers offer their opinions on the merit of research without any meaningful consequence to themselves.  Many reviewers do a laudable job, but there is nothing to stop them from using their reviews to advance findings they prefer and block findings they dislike regardless of the true merit of the work.  Peer-review is often little more than the anonymous committee vote of a panel composed of some mix of competitors and allies.  It is about as reliable as the Miss Congeniality vote at a beauty contest.  Do we really think she’s the nicest contestant or did the other contestants voting anonymously have ulterior motives for burying her with faint praise?

The true test of research quality is replication.  Science doesn’t determine the truth by having an anonymous committee vote on what is true.  Science identifies the truth by replicating past experiments, applying them to new situations, to see if the results continue to hold up. 

I’m pleased to say that several pieces of my work have been successfully replicated.  By successful replication I mean that the basic findings are upheld.  Replicators almost always make new and different choices about how to handle data or run an analysis.  The question is whether the same basic conclusion is found even when those different choices are made.

The evaluation I did with Paul Peterson and Jiangtao Du of the Milwaukee voucher experiment was successfully replicated by Cecilia Rouse.  The evaluation I did of the Charlotte voucher program was successfully replicated by Josh Cowen.  My study of Florida’s A+ voucher and accountability program was successfully replicated three times — by Raj Chakrabarti; Rouse, et al; and West and Peterson.  And my graduation rate work has been successfully replicated by Rob Warren and Chris Swanson.

The interesting thing is that every one of my studies above was initially released without peer review.  And every one of them was attacked for being unreliable because they were not peer reviewed.  When they were all later published in peer reviewed journals (except the grad rate work) and successfully replicated I don’t remember ever hearing anyone retract their accusations of unreliability. 

(edited for typos)


In Defense of the BCS

December 1, 2008

Barack Obama has his finger on the pulse of American public opinion.  So when the president-elect came out in support of an 8 team college football playoff to replace the current BCS-selected match-up of the top two teams, he was endorsing a view held by 97.4% of all football fans.  This stat comes from the same source that found that 73.8% of all statistics are made up on the spot.

I, however, am among the 2.6% that prefers the current BCS method.  Why? — because an 8 team playoff solves virtually none of the supposed injustices of a BCS-selected championship game and because playoffs create significant, new problems.

The main injustice that a playoff is supposed to prevent is the exclusion of worthy teams from competing in the post-season for the national championship.  The current system uses a formula combining coach and journalist rankings of teams with computer models of team performance given the difficulty of their schedules to identify the top two teams in the country.  Those two teams then play for the national championship. 

“But what about the third ranked team?” opponents of this system ask.  Shouldn’t they have a chance to compete for the championship also?  This concern for injustice is compounded by disputes over whether the top two teams identified by the BCS really are the two best teams.  People become particularly passionate about this if their team is the one ranked 3rd (or even 4th, 5th, etc…).  And the fact that computer models have a hand in selecting the top two teams only fuels the technophobe football fan rage.  The intensity of opposition to BCS ratings is almost always inversely related to a person’s ability to do algebra (or even compute simple sums).

Moving to an 8 team playoff doesn’t really solve this perceived injustice.  Instead of arguing over whether the 3rd ranked team was unjustly excluded from competing in the post-season for the national championship, we’ll just argue about whether the 9th ranked team was unjustly excluded.  You have to draw the line somewhere.

In addition, there has to be some method for selecting the 8 teams.  If you don’t like relying on computer models and polls, try to describe a system that would more accurately identify the best teams.  Some have suggested providing guaranteed spots to the winners of 6 of the most competitive conferences with two additional teams selected at-large.  But it’s not hard to imagine the injustices that would flow from such a system.  Who gets to pick the 6 conferences?  Why shouldn’t the 7th conference have a guaranteed spot?  What if there are two top-notch teams in a conference?  How will we select the two at-large teams?  The bar arguments will never end no matter how we select teams.

The virtue of the BCS method of ranking is that it combines multiple reasonable methods into a single rating.  It incorporates the subjective judgment of experts as well as the dispassionate computer assessment of team schedules.  Sure, the BCS, like any rating system, will be imperfect.  But its methodology is reasonable and the rules are clearly stated in advance.

The only question remaining is why only have 2 teams in the post-season instead of 4 or 8 (or 16 for that matter).  I’ve already argued that drawing the line anywhere is somewhat arbitrary and would produce disputes and claims of injustice.  But others might respond that it is better to have more teams included in the post-season than fewer. 

The problem with expanding the post-season to include more teams in the national championship race is that it would require more games to be played.  You cannot add games to college football without a price.  Other than among advocates of the ginormous financial bailout, everyone understands that there is no such thing as a free lunch.  Extra college games come at a cost.

If we simply add two more games to the post-season to have an 8 team playoff, we are requiring players to have longer seasons with greater opportunities for injuries.  Remember that college football players are uncompensated young students (and free tuition hardly qualifies as fair compensation given how much revenue they generate).  If we make them play longer seasons, they run a significantly higher risk of suffering debilitating injuries that could ruin any hopes for a professional football career and/or turn them into life-long cripples.  Barack Obama and 97.4% of all football fans may not care about exploiting unpaid college kids for our entertainment, but I think there have to be limits.

I suppose we could instead shorten the regular season by two games to avoid making players extend their season.  But if we do that we will reduce the information from the regular season for determining who deserves to be in the playoffs.  We’ll also deprive the vast majority of college football programs and their fans of two games and the revenue those games produce.  Again, there is no free lunch.

People wonder why college football is the only major sport without a playoff.  But college football is different from other sports.  Football is so brutal that it can only be played once a week and even then the probability of serious injury increases dramatically with each additional game.  We can expect the pros to play longer and run those risks because, well, they’re pros.  They are paid (although not nearly enough — but that is a story for another day), while college athletes are virtually unpaid (and that is an injustice that should also be corrected — but that is also a story for another day).  I’d rather have a bunch of bar arguments over whether the 3rd ranked team was unjustly excluded from the championship game than significantly increase the exploitation of college football players.


Guess Who Wants A Bailout

November 25, 2008

A major industry has gotten in line to receive a bailout.  It directly employs more than 6 million people.  That’s a lot of people considering that there are a total of 300 million men, women, and children in the US of whom 137 million are currently employed (excluding farmers).  So the workers in this industry constitute about 4% of all workers in the US.

Those 6 million workers directly serve almost 50 million customers.  While recent figures are not available, the industry had revenue of about $536 billion as of 2006 when total US GDP was $13.13 trillion.  So this industry constitutes about 4% of total US GDP.
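The two "about 4%" figures can be reproduced with a couple of divisions. This is a rough sketch using only the numbers given above; the names are made up for illustration, and since the industry employs "more than" 6 million, the worker share is a floor rather than an exact value.

```python
def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Figures quoted in the post (names are illustrative)
industry_workers = 6_000_000    # "more than 6 million", so a lower bound
us_workers = 137_000_000        # employed in the US, excluding farmers
industry_revenue = 536e9        # industry revenue, 2006, in dollars
us_gdp = 13.13e12               # total US GDP, 2006, in dollars

worker_share = percent_of(industry_workers, us_workers)
gdp_share = percent_of(industry_revenue, us_gdp)

print(f"Share of US workers: {worker_share:.1f}%")
print(f"Share of US GDP: {gdp_share:.1f}%")
```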

Despite its size and importance, this industry has a notorious track record of performance.  It fails to complete more than a quarter of the products it starts.  Even among those it does finish, almost 40% fail to meet basic standards for quality.  Quality has not improved a smidge in over three decades despite more than doubling the average cost of production.  And foreign competitors are cleaning our clocks.  In a comparison of 21 industrialized countries, US quality exceeded only that of South Africa and Cyprus.

And this industry has huge and understated pension liabilities that, failing a miraculous improvement in the returns on investments, will inevitably have to be paid by taxpayers.  These “legacy” costs are consuming an increasing share of resources and distorting labor markets, hindering an industry turnaround.  But the unionized workforce continues to press for increased pay and benefits while opposing restructurings that might address quality-control problems.

Despite an unwillingness to correct its structural weaknesses, either controlling costs or improving quality, captains of this industry are appealing to politicians for a bailout.  As one recently said, “The most commonly heard solution out of Washington these days is a bailout where the federal government intervenes to safeguard key industries and in the process, the quality of American life.  If that’s the rationale, than I cannot think of a more strategic investment than safeguarding the quality of [our industry].”

Are we talking about the US auto industry?  It sounds like we could be, but I’m sure most of you have guessed that the industry described here is the US K-12 public education industry. 

And who is it that is requesting the bailout on behalf of K-12 public education?  None other than Alberto Carvalho, the superintendent of Miami-Dade schools.  This is the same Alberto Carvalho who manipulated a romantic relationship with a Miami Herald reporter to advance his career.  I guess when he’s not busy with naughty text messaging, he’s making the case for an education bailout: “The question in my mind is this: At a time when we’re continuing the bailout of key industries, at what point do we have a bailout of public education?”

Watching folks scramble for bailout funds is like watching pigs at the trough.  It’s only a matter of time until Starbucks gets in line.  After all, the US economy needs liquidity.

(edited to note that it is K-12 public education)


Moe in WSJ

November 24, 2008

Terry Moe has an excellent piece in the Wall Street Journal today.  He suggests that the Democrats (including himself as an early Obama supporter) are the logical source of education reform.  He writes:

“If children were their sole concern, Democrats would be the champions of school choice. They would help parents put their kids into whatever good schools are out there, including private schools. They would vastly increase the number of charter schools. They would see competition as healthy and necessary for the regular public schools, which should never be allowed to take kids and money for granted.”


Education Next Ranks the Blogs

November 21, 2008

Mike Petrilli has a piece in the new issue of Education Next that ranks some of the most prominent education policy blogs.  The JPGB (that’s Jay P. Greene’s Blog) was ranked 10th according to Technorati’s authority measure, which counts the number of links to a web site in the last 180 days.  JPGB came in just behind Flypaper, to which Petrilli contributes and which was started at about the same time as JPGB.

But Education Next is part of the dead wood media and the numbers are out of date.  They’re like so two months ago.  Rob Pondiscio over at Core Knowledge has more current numbers and added some other blogs to his list based on what was in his bookmarks.  Here is what he found:

Blog                            Technorati Authority    Google Page Rank

Joanne Jacobs                   217                     6
Eduwonkette                     167                     6
Eduwonk                         146                     7
Campaign K-12                   125                     6
The Education Wonks             119                     6
Flypaper                        95                      5
Jay P. Greene                   93                      6
The Quick and the Ed            87                      6
Matthew K. Tabor                85                      6
Core Knowledge                  84                      5
This Week in Education          79                      5
Edwize                          74                      6
Intercepts                      69                      4
Schools Matter                  68                      5
Bridging Differences            66                      6
D-Ed Reckoning                  56                      5
Edspresso                       46                      5
NCLB Act II                     40                      5
Sherman Dorn                    39                      5
Eduflack                        29                      5
Swift and Change Able           27                      5
Thoughts on Education Policy    25                      4

UPDATE:  I’ve added the Google Page Rankings, which you can identify for any web site here.  Unlike Technorati, which just counts links to a site, Google Page Rank weights links by how many links the other sites receive.  This seems like a better approach but unfortunately the Google Page Ranks are only provided on a 1 to 10 scale.  Using it, Eduwonk is the king of the education policy blogs, not Eduwonkette.
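To make the difference between the two measures concrete, here is a toy sketch in Python. It compares a raw count of inbound links with a simplified, textbook-style PageRank iteration. The five-site web, the damping value of 0.85, and the function names are all assumptions for illustration; this is not Google's or Technorati's actual computation, and it does not reproduce the 1 to 10 scale.

```python
def link_counts(links):
    """Count inbound links for each site (the simple link-counting approach)."""
    counts = {site: 0 for site in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts

def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: a link is worth more when the linking site ranks high."""
    sites = list(links)
    n = len(sites)
    rank = {site: 1.0 / n for site in sites}
    for _ in range(iterations):
        new_rank = {site: (1 - damping) / n for site in sites}
        for source, targets in links.items():
            # A site with no outbound links spreads its rank evenly over all sites
            recipients = targets if targets else sites
            portion = damping * rank[source] / len(recipients)
            for target in recipients:
                new_rank[target] += portion
        rank = new_rank
    return rank

# Hypothetical toy web: each key links to the sites in its list
links = {
    "hub": ["endorsed"],
    "endorsed": ["hub"],
    "popular": ["hub"],
    "fan1": ["popular"],
    "fan2": ["popular"],
}

counts = link_counts(links)
ranks = pagerank(links)
for site in sorted(links, key=ranks.get, reverse=True):
    print(f"{site:10s} inbound links: {counts[site]}  rank: {ranks[site]:.3f}")
```

In this made-up web, "popular" has more inbound links than "endorsed", but its links come from obscure sites, so the weighted measure ranks "endorsed" higher. That is the sense in which weighting links can reorder a list built on raw counts.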