(Guest Post by Matthew Ladner)
So Michelle Malkin recently wrote a series of alarmed columns warning of the dangers of the Common Core. Here is a taste:
Under President Obama, these top-down mal-formers — empowered by Washington education bureaucrats and backed by misguided liberal philanthropists led by billionaire Bill Gates — are now presiding over a radical makeover of your children’s school curriculum. It’s being done in the name of federal “Common Core” standards that do anything but set the achievement bar high.
Substitute the word “conservative” for “liberal” and the paragraph reads like Diane Ravitch. Ms. Malkin proceeds to repeat various anti-Common Core assertions as facts, but are they facts? Having read that last bit about “standards that do anything but set the achievement bar high,” I decided to put it to a straightforward empirical test.
Kentucky was the earliest adopter of Common Core in 2012, and folks from the Department of Education sent some before-and-after statistics regarding 4th grade reading and math proficiency. I decided to compare them to NAEP, starting with the 2011 KY state test and the 2011 NAEP for 4th Grade Reading and Math. NAEP has four achievement levels: Below Basic, Basic, Proficient and Advanced. Kentucky also has four achievement levels: Novice, Apprentice, Proficient and Distinguished. The first figure compares “Proficient or Better” on both NAEP and the state test in 2011:
As you can see, Kentucky’s definition of “Proficient” was far more lax than that of NAEP. In the Spring of 2012 however they became the first state to give a Common Core exam. How did the 2012 state results compare to the 2011 NAEP?
Kentucky’s figures are strongly suggestive that the new test is a good deal more rigorous than the old one- it tracks much closer to NAEP than the previous test. It is possible that Kentucky had item exposure that explains some of the difference, but let’s just say there is an awful lot of difference to explain. We would expect somewhat lower scores with a new test, but if the new test were some dummied-down terror…
There will also still be honest differences of opinion over standards independent of the rigor of the tests. Moreover, just because it is an obnoxious pet-peeve of mine, it is worth noting that starting out more rigorous doesn’t guarantee that they will stay that way…
A formal study could definitively establish the rigor of the new Kentucky test vis-a-vis NAEP, but it is well worth considering where KY’s old test ranked in such a study by NCES. Short answer: Kentucky’s old standards were high-middle when compared to those of other states. Ergo we can infer that the proficiency standard on the KYCC test is far closer to those of NAEP than a large majority of current state exams.
There is room for honest debate regarding Common Core as a sustainable reform strategy, but we should have that debate rather than the present one.
UPDATE: Reader Richard Innes detected an error in the NAEP proficiency rates in the first version of this post. I made the mistake of looking at the cumulative rather than the discrete achievement levels and then treating the cumulative as discrete, thus double counting the NAEP Advanced. If you have any idea what I am talking about, give yourself a NAEP Nerd Gold Star. Getting instant expert feedback is one of the best things about blogging, and I have updated the charts to correct the error.
In terms of substance, the correction leaves both sets of KY tests further from the NAEP proficiency standards than originally shown, but the new ones are still far closer than the old ones.
“Kentucky’s figures are strongly suggestive that the new test is a good deal more rigorous than the old one- it tracks much closer to NAEP than the previous test.”
Perhaps, perhaps not. No question it is *harder*, but is it more rigorous? Look at this NAEP item (http://tinyurl.com/an2r3wa) — a perfect example of an item with difficult numbers but a rather trivial concept. Does it make the test more “rigorous” for 8th graders? Clearly not.
“There will also still be honest differences of opinion over standards independent of the rigor of the tests.”
Indeed. Measuring the quality of standards by their assessment is like evaluating the quality of a car by how easy it is to pass a driving test in it. There may be some very tenuous connection, but can you argue with a straight face that passing rates are higher in a Mercedes or BMW than in a Hyundai or Kia?
Standards are about what is taught and at which grade. Assessment — every decent assessment!! — is aligned with the standards, whether the standards are excellent, mediocre, or worthless. And, as I just demonstrated above, one can make any test harder or easier, independent of whether the concepts are harder or easier.
Finally, you correctly say that “starting out more rigorous doesn’t guarantee that they will stay that way.” In fact, the situation is worse: the cut scores are not set yet!! The Kentucky pilot test set essentially arbitrary cut scores, largely because it wanted to test the waters and see the public reaction. Only *after* the actual testing occurs next year will PARCC set the cut scores. I can promise you they will be set by a *political* process more than by anything else.
Which brings me to Kentucky, which you praised as “the earliest adopter of Common Core in 2012.” What you should have mentioned is that Kentucky adopted the Common Core standards even before they were finished … so much for the educational diligence of the great state of Kentucky.
The only way to ascertain the quality of the standards before full-scale implementation is by expert review. And the experts had their say. Sandra Stotsky and Jim Milgram, the only content experts on the Common Core Validation Committee, refused to sign off on them because they found them insufficiently rigorous and below our international competitors. The Deputy Director of the UK Institute of Education, another validation committee member, refused to sign off on them for the same reasons. Jonathan Goodman, a Courant Institute mathematics professor, found them 1-2 years below international levels. And even Andrew Porter, Dean of Penn GSE, as “establishment” as they come, found them mediocre. Bill Schmidt had to resort to magician tricks of re-arranging tables to present them as “similar” to the TIMSS A+ countries — hard to believe anyone would ever go lower than he did.
I take your point about cut scores. The Kentucky officials I corresponded with reported that they set their cut scores by following the methodology laid out by the CC effort. Time may prove you correct that the overall CC effort does not follow their own cut score methodology, but it isn’t the case that KY officials set their scores out of thin air.
ML – good blog and good comments too.
I agree that cut score setting was, is, and will be a political process.
Question: I wonder if you’d elaborate on how you see the relation of standards and assessment. I offer mine just below to invite contradiction.
My view in Massachusetts is that many teachers in a “tested grade” ignore the standards documents themselves and focus on the test. They use released MCAS exams to backwards map the curriculum, and increasingly interim proxies like ANet for same.
That is, the MCAS itself serves as the “true” list of standards, not the document titled “standards.”
I’ve always thought this could be empirically studied. Has it been?
I.e., a survey where we a) ask teachers of “tested subjects” to describe the state exam, and then b) ask them to describe standards that are NOT on the exam.
If I’m right, then most teachers could do “a” but not “b.”
If teachers react more to the standards than the test, then they could do “a” AND “b.”
Regarding Mike G.’s comment:
If Massachusetts’ teachers are mostly just teaching to the tests, they must be good ones, because Massachusetts also leads the nation on the NAEP.
It will be interesting to see what happens once Massachusetts comes online with CCSS, as it is currently committed to do. For that state, I suspect a serious reduction in educational performance may result with CCSS.
1. First, good catch on the KY data.
2. Precisely. See the comparative data at the bottom of the page. MCAS and NAEP as tests are generally aligned in results.
The relation between standards and test is complex and nobody really understands it well. What we do know:
Good standards do not necessarily lead to high results on the test. For example, both Massachusetts’ and California’s standards were highly praised for a long time, yet Massachusetts moved with them to the top of the heap over time, while California largely meandered far below. Similarly, the next three states just below Massachusetts on NAEP math — MN, NJ and VT — had math standards rated B, C, and F respectively.
Having said that, this does not necessarily mean that standards have no impact — good ones may be helpful overall even if mediated by other issues like implementation. Further, it may still be that mediocre standards are clearly harmful while good ones are only *potentially* helpful. For example, if content holes are present in the standards, there is little to mediate them — that content will simply not be taught. On the other hand, if the standards specify almost no content, that may earn them an F — as in the case of VT — but it probably indicates that schools simply don’t look to standards for guidance, and hence their impact is probably minimal despite the low grade.
What one can say about the CC is this: on one hand, they are quite elaborate and prescriptive, often annoyingly so. On the other hand, they are not very rigorous — certainly when compared internationally — and they have significant issues and holes. And they will have a highly prescriptive and highly consequential test attached *precisely* to them, which will drive schools to teach to the test.
It is difficult to make predictions, particularly about the future (smile), but the CC contains the seeds of the worst possible scenario: mediocre and experimental standards enforced quite rigidly through a highly consequential test on tens of millions of students without any serious piloting. Talk about a recipe for widespread disaster, even ignoring the federal intrusion aspect.
Thank you Ze’ev for the thoughtful and detailed reply.
Regarding Mike G. and Ze’ev’s comments.
I would not be so quick to dismiss California on NAEP. You MUST look at the disaggregated scores by race to begin to understand this state, because it has experienced a tsunami of non-English-speaking Hispanic immigration.
I don’t have a lot of time, but I did a quick check of California’s Grade 4 NAEP performance in 1996 and 2011. In 2011, California tied the national average scores for whites and blacks (actually both California scores were slightly higher, but not high enough to overcome the sampling error in NAEP).
In 1996 both groups scored statistically significantly lower than the national average.
That is notable progress.
In 1996 California’s Hispanics scored 196 and the nationwide Hispanic score was 204. Nevertheless, the NAEP Data Explorer significance test says that 8-point difference was not statistically significant.
In 2011 California’s Hispanics scored 222, only seven points behind the national average, but that difference, though nominally smaller, was considered significant. So, thanks to sampling error issues, California would be shown as losing ground for Hispanics even though the actual gap closed slightly.
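That sampling-error point can be sketched numerically. The NAEP Data Explorer compares two independent estimates, so the significance of a gap depends on the standard errors of both samples, not just the gap's size. A minimal Python sketch, with hypothetical standard errors (not actual NAEP values) chosen only to illustrate how smaller 1996 samples could leave an 8-point gap non-significant while larger 2011 samples make a 7-point gap significant:

```python
import math

def gap_is_significant(gap, se_state, se_nation):
    """Two-sample z-test for a score gap between independent estimates.

    The standard error of a difference between two independent means
    combines both sampling errors in quadrature.
    """
    se_diff = math.sqrt(se_state**2 + se_nation**2)
    z = gap / se_diff
    return abs(z) > 1.96  # significant at the 95% level

# Hypothetical standard errors, for illustration only:
# with wide 1996-style SEs, an 8-point gap is NOT significant...
print(gap_is_significant(8, se_state=3.5, se_nation=2.5))  # False (z ~ 1.86)
# ...but with tighter 2011-style SEs, even a smaller 7-point gap IS.
print(gap_is_significant(7, se_state=1.5, se_nation=1.0))  # True (z ~ 3.88)
```

The same arithmetic explains why a nominally shrinking gap can flip from "not significant" to "significant" without any real loss of ground.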
Of course, in 1996 California was only 34 percent Hispanic in its public schools. By 2011, Hispanics were the majority, and many of them were English language learners (NAEP is only given in English in the continental US), so that suppressed the overall California averages.
Anyway, do not be so quick to sell short California’s progress with its back-to-basics reform, which I believe came about somewhere around 1998.
But, you can’t see this in overall NAEP results.
Now, everyone go read this to get smarter about NAEP. You’ll find some examples there about California, by the way.
There was a study done in Washington state a while back that sort of addressed the question: do teachers pay more attention to the test or the standards? I believe the study relied on survey data, and the take-away was that the state assessment trumped the state standards in terms of teacher focus.
Here’s a link:
Thanks, Richard, for elaborating on California. I didn’t want to go into the details that you so helpfully provided, and I left it simply as “largely meandered far below.” If we do want to delve into California a bit more, I would point to its rather extraordinary success with Algebra in grade 8. Not only did California more than quadruple(!) the share of students taking Algebra by grade 8, but the fraction getting “proficient” or “advanced” keeps steadily growing despite this huge increase in the pool of takers. Even more impressively, the growth in minority and low-SES students successfully taking Algebra by grade 8 far exceeds the cohort average.
But all this will now disappear under the Common Core. Algebra is firmly placed by the Common Core in high school. Middle school Algebra will be — again — reserved only for the children of the affluent. A regression of more than a decade for US education.
So 46 states and DC are signing onto standards inferior to international levels? All that time and money and US schools will be striving for inferiority? Come on. Blow the whole thing up and start over.
My mother always told me, “Don’t compare yourself to the worst, Erin, compare yourself to the best.”
I am surprised by Ladner’s use of Kentucky and the following claim he makes:
“Short answer: Kentucky’s old standards were high-middle when compared to those of other states. Ergo we can infer that the proficiency standard on the KYCC test is far closer to those of NAEP than a large majority of current state exams.”
Here is what Fordham reported on Kentucky’s former standards:
“The Bottom Line: With their grade of D, Kentucky’s ELA standards are among the worst in the country, while those developed by the Common Core State Standards Initiative earn a solid B-plus. The CCSS ELA standards are significantly superior to what the Bluegrass State has in place today.”
The Common Core standards are a “dumbing down” for most states. Kentucky, obvious to most, had a long way to go.
The comparison here is between the tests rather than the standards. There are contending opinions regarding the quality of the standards, but as an empirical sort of guy I don’t know what to make of that; it seems a bit like watching a group of art critics argue over a painting. The plump movie critic loves Pulp Fiction, but the skinny one gives it a thumbs down… etc.
So if Common Core is dumbing down most states, do you have any empirical evidence to support the claim?
By the way, officials in Illinois just voted to substantially raise the cut scores on their state exam in anticipation of CC. They could be wrong, but they obviously believe the new exam is going to be far more challenging:
A better question would be, is there any empirical evidence that CC will be an improvement for states. With the lack of field testing, I think neither side can answer that question.
I agree entirely that it would have been an important question to ask three years ago. Policymakers have shifted the default in 46 states, however, changing the question from “state standards or CC” to “CC or what?”
It is possible to have bad standards and super high cut scores. For instance, we can imagine an 8th grade math test full of a thousand very simple addition problems but setting the requirement at getting every single one of them correct in order to score “Proficient.”
While it is possible that is what we see going on in KY, it seems a bit fantastic for it to actually be the case. If it is, however, I expect someone will be able to demonstrate it.
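The thousand-addition-problems hypothetical can be made concrete: even a student who answers each trivial item correctly 99% of the time almost never clears an all-or-nothing cut score over 1,000 items, so a brutal cut score proves nothing about item rigor. A quick sketch, where the 99% per-item accuracy and the independence of items are assumptions made purely for illustration:

```python
from math import comb

# Probability a student with 99% per-item accuracy clears a
# "get every one of 1,000 trivial items right" cut score.
p_item = 0.99
n_items = 1000
p_pass_all = p_item ** n_items
print(f"Pass probability at a 100% cut: {p_pass_all:.6f}")  # effectively zero

# At a 90%-correct cut score, nearly every such student passes
# (binomial: at least 900 of 1,000 items correct).
p_pass_90 = sum(comb(n_items, k) * p_item**k * (1 - p_item)**(n_items - k)
                for k in range(900, n_items + 1))
print(f"Pass probability at a 90% cut: {p_pass_90:.6f}")
```

In other words, the same trivial items can produce any "Proficient" rate the cut score demands, which is why the proficiency gap alone cannot settle the rigor question.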
Point taken, Dr. Ladner. My thoughts are: 1) What was wrong with NAEP as a cross-national comparison? I have no idea why we need new tests for that purpose. 2) The Kentucky tests are not what PARCC and SBAC will eventually put out. Those tests are still under construction. So we have no idea how well the KY tests will correspond to those.
And, 3) My biggest objection on standard quality is that the CCSS folks promised world class, and delivered mediocre. Of course, mediocre is better than states generally had. But it’s a deception that, if parents and taxpayers really heard, would make them less likely to support this expensive and governance-wise dangerous initiative.
1. I love NAEP as a cross-national comparison, and nothing aggravates me more than when utterly naive Common Core supporters say things like “well now that we have Common Core, we won’t need NAEP.” Oh by all means, let’s get rid of external barometers and hope for the best #morons.
3. True, but the process of drawing up standards will inevitably lead to disagreements over what constitutes quality. I’ve written previously that I would much rather have my state adopt the MA standards, given that both sides seem to agree they are currently the best and better than Common Core.
If the choice is between CC and AZ Learns, I’ll warily take my chances with CC. We don’t have anything to lose, after all, and the state is free to leave if things go south.
I suggest you revise this post immediately and contact me.
According to the NAEP Data Explorer and the 2011 math and reading report cards, you have the wrong NAEP scores for Kentucky in both reading and math. Our Grade 4 At or Above Proficient percentage in reading is only 35%, and it’s only 39% in math. I don’t know where you got your numbers, because even our white-only proficiency rates are lower on NAEP in 2011.
Actually, I have already posted a bunch of comparative data in our “K-PREP Data Sourcebook,” online here:
Click to access K-PREP_Data_Sourcebook.pdf
This is a good tool for researchers on CCSS and lists 8th grade as well as 4th grade comparisons to NAEP as well as other credible, college and career aligned tests from ACT, Inc.
Staff Education Analyst
Bluegrass Institute for Public Policy Solutions
You are quite right- I looked at the cumulative rather than the discrete data, thus double counting the “Advanced” category. Making appropriate revisions; thanks for the catch.
One more comment on the NAEP is appropriate. Whenever you do state-to-state comparisons with this federal test, it is important to break the data out, at least by race.
Because of very different student demographics among the states and the continuing severe education gaps, a state like Kentucky with a huge white population (still 84% as of 2011 NAEP testing) gets an amazing, and unearned, advantage when you only take a simplistic look at overall average scores.
You can read a lot about pitfalls in interpreting NAEP here:
I also have a paper published on the subject, which you can locate through this link in ERIC:
These articles show how Kentucky’s apparently good performance starts to come into question as soon as you start looking at statistically valid comparisons of scores for various racial groups. Reading one or the other should be required of any researcher who intends to do anything with the NAEP.
It’s interesting you cite Kentucky as the example of previous educational mediocrity.
KY adopted Common Core after the release of the first (incredibly low quality) CCSS draft and agreed to preview the national test because Gene Wilhoit (former KY ed commissioner and head of the CCSSO) had significant influence there.
So in highlighting KY’s academic mediocrity, you are citing the state work of the head of the major organization driving Common Core? Do you get the irony of that? As we’ve stated many times, Common Core is being driven by the same people and organizations that have failed America’s schoolchildren for decades. Glad you agree.
All the major players behind Common Core – Fordham (check out OH’s average NAEP scores/standards in the last 20 years!), Achieve, Inc. (ADP), Jim Hunt (NC), Bob Wise (WVA), CCSSO, NGA, and Gates — have mediocre records improving student achievement anywhere in the last 20 years.
Jeb’s FL is perhaps, maybe, kind-of an exception. Its gains have moved it from the bottom to an average-to-below-average performer in the overall NAEP rankings, and the recent NAEP trajectory for FL suggests it has leveled off. FL never had high academic standards — which has proved one of the Achilles’ heels of the set of reforms Jeb implemented.
Interestingly, the Florida Virtual School, which predates Jeb, had more of a focus on high-quality content than the state as a whole.
You might want to look at these blogs on the middling NAEP performances of the major Common Core advocates.
and on FL
I don’t follow your point. State tests are bad almost everywhere but apparently just got better in Kentucky. I’ll pour myself a glass of pure grain alcohol and rainwater to restore my purity of essence and then give the matter some more thought.
Mr. Gass, I also am not following your point.
You appear to be saying “miscreants are responsible for common core” “miscreants are responsible for Kentucky’s prior standards/assessment” and “most are getting mediocre results based upon NAEP scores.”
Mr. Ladner appears to be making the point that common core in Kentucky has dialed in the measures to more accurately reflect the values recorded with NAEP.
I too welcome the debate on common core’s merits, and I could easily be persuaded that principles like federalism, avoiding shared mediocrity, and ideological uniformity are reasons to avoid common core.
But I also see economies of scale, the end of state manipulation of results and thoughtful targeting of the ends sought in public education as valid merits of multi-state standards.
Mr. Ladner has offered a data point for one of the merits–ending state manipulation of results.
So let’s have clash.
I want to reply to one assertion made by Mike G above:
“My view in Massachusetts is that many teachers in a “tested grade” ignore the standards documents themselves and focus on the test. They use released MCAS exams to backwards map the curriculum, and increasingly interim proxies like ANet for same. That is, the MCAS itself serves as the “true” list of standards, not the document titled “standards.”
I speak only for ELA standards, but I think the practice extended to math and science as well. Every ELA standard for a grade is assessed on the MCAS for that grade. If you take a look at the released test items, you will find a list showing which standard a particular item was based on. Teachers teaching to MCAS were also teaching to the standards for their grade, and to all of them.
Item difficulty is another matter altogether. But in ELA, one looks at the difficulty level (reading level) of the passages (especially revealing in grade 8 and grade 10) and the quality of the multiple-choice questions and the essay questions. The MCAS essay questions that I have looked at over the years in grade 8 and grade 10 were developed by high school English teachers and were generally superb. They attracted more attention from my class of Arkansas English teachers a few years ago than the passages themselves (whose reading level/literary quality also impressed them). They could recognize a good high-school level essay question when they saw one, and knew that the questions on their own Arkansas state exams in grade 11 were of a much lesser quality. We don’t find high school English teachers asked to do this kind of analysis on, e.g., the SBAC released sample test items for ELA. No psychometrician can come close to the judgment a well-educated English teacher can exercise. Sandra Stotsky
I’m glad scores are up in Kentucky. To evaluate the wisdom of CC, we’ll have to watch that and numerous other indicators, including but not limited to:
-Scores in all the other states that adopt CC, including states whose prior standards were superior to CC;
-The gap between American scores and those of our international competitors, whose standards are higher than CC’s;
-The long term trends that emerge as CC is inevitably captured by the blob and neutralized; and
-The return of the culture wars over school curricula, now operating nationwide instead of just in a few locations.
[…] Ladner brings up Kentucky as an example that the Common Core State Standards are working. Kentucky was one of the first states to implement […]
Extremism in Defense of Mediocrity is Quite a Vice | Jay P. Greene’s Blog