February 17, 2009

Alfie Kohn and the Murray Gell-Mann Amnesia effect Part II

Continuing on from Part I.

We're now getting into the last prong of Kohn's main argument, after which he appears to take a kitchen-sink approach and throw in all the remaining negative information he could find.

Kohn's last argument is based on a logical fallacy:

Finally, outside evaluators of the project – as well as an official review by the U. S. General Accounting Office – determined that there were still other problems in its design and analysis that undermined the basic findings. Their overall conclusion, published in the Harvard Educational Review, was that, “because of misclassification of the models, inadequate measurement of results, and flawed statistical analysis,” the study simply “does not demonstrate that models emphasizing basic skills are superior to other models.”

As a preliminary matter, Kohn fails to mention that the "outside evaluators" he's referring to, House et al., were funded by the Ford Foundation, which had also funded a few of the losing models in PFT. As such, Kohn's source has "potential bias" issues that Kohn fails to alert his readers to. Kohn also fails to alert his readers to all the other similar "outside evaluators" who analyzed both the PFT data and House's analysis and came to a different conclusion. These other outside evaluators are no more biased than House, so it's curious that Kohn would fail to mention them.

Next comes the first of Kohn's logical fallacies. Kohn commits the fallacy of division (or whole-to-part fallacy) when he claims PFT "simply 'does not demonstrate that models emphasizing basic skills are superior to other models.'" Even if the basic skills models as a whole weren't superior to the other models, that doesn't mean the DI model alone wasn't. The data certainly show that the DI model was the superior performer.

The point here is that Kohn is criticizing DI yet has already started to veer off and is trying to mislead readers by dragging in information pertaining to other programs or the more general classification of basic skills programs.

Next Kohn makes an appeal to authority, another logical fallacy, when he mentions that the results were "published in the Harvard Educational Review." I'd be willing to cut Kohn some slack had he named the journals for all the other research he cites. But this is the only one he names. Coincidence? What I do know is that he failed to mention another study on PFT, also published in the Harvard Educational Review, that affirmed the findings of PFT. Another convenient oversight.

Kohn again buries the weakest parts of his argument in a footnote. Kohn parrots the House study's findings, which are based on a reanalysis of the PFT data. This reanalysis was not without its own problems, as set forth in another study by Bereiter et al., which reanalyzed the House reanalysis. (It should be mentioned that this study has the same potential bias problems as the House study, since Bereiter had professional ties to DI before PFT.)

Let us therefore consider carefully what the House committee did in their reanalysis. First, they used site means rather than individual scores as the unit of analysis. This decision automatically reduced the Follow Through planned variation experiment from a very large one, with an N of thousands, to a rather small one, with an N in the neighborhood of one hundred. As previously indicated, we endorse this decision. However, it seems to us that when one has opted to convert a large experiment into a small one, it is important to make certain adjustments in strategy. This the House committee failed to do. If an experiment is very large, one can afford to be cavalier about problems of power, since the large N will presumably make it possible to detect true effects against considerable background noise. In a small experiment, one must be watchful and try to control as much random error as possible in order to avoid masking a true effect.

However, instead of trying to perform the most powerful analysis possible in the circumstances, the House committee weakened their analysis in a number of ways that seem to have no warrant. First, they chose to compare Follow Through models on the basis of Follow Through/Non-Follow Through differences, thus unnecessarily adding error variance associated with the Non-Follow Through groups. Next, they chose to use adjusted differences based on the "local" analysis, thus maximizing error due to mismatch. Next, they based their analysis on only a part of the available data. They excluded data from the second kindergarten-entering cohort, one of the largest cohorts, even though these data formed part of the basis for the conclusions they were criticizing. This puzzling exclusion reduced the number of sites considered, thus reducing the likelihood of finding significant differences. Finally, they divided each effect-size score by the standard deviation of test scores in the particular cohort in which the effect was observed. This manipulation served no apparent purpose. And minor though its effects may be, such as they are would be in the direction of adding further error variance to the analysis.

The upshot of all these methodological choices was that, while the House group's reanalysis largely confirmed the ranking of models arrived at by Abt Associates, it showed the differences to be small and insignificant. Given the House committee's methodology, this result is not surprising. The procedures they adopted were not biased in the sense of favoring one Follow Through model over another; hence it was to be expected that their analysis, using the same effect measures as Abt, would replicate the rankings obtained by Abt. (The rank differences shown in Table 7 of the House report are probably mostly the result of the House committee's exclusion of data from one of the cohorts on which the Abt rankings were based.) On the other hand, the procedures adopted by the House committee all tended in the direction of maximizing random error, thus tending to make differences appear small and insignificant.
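The power argument in the passage above can be made concrete with a minimal Python sketch. All the numbers (effect size, standard deviations, group sizes) are hypothetical and chosen only for illustration; none of them come from the PFT data or from either reanalysis, and the helper names are made up for this sketch. It uses a simple normal approximation for a two-group comparison of means, which is an assumption and not the method used by Abt, House or Bereiter.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_power(effect, sd, n_per_group, z_crit=1.96):
    """Approximate power of a two-sided z-test comparing two group means.

    effect      -- true difference between the two group means
    sd          -- standard deviation of one unit of analysis
    n_per_group -- number of units (pupils or sites) in each group
    z_crit      -- critical value; 1.96 corresponds to a 5% two-sided test
    """
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    shift = effect / se                     # true effect in standard-error units
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


# Hypothetical inputs, not taken from any Follow Through report.
effect = 0.25

# Large experiment: thousands of individual scores per group.
power_large = approx_power(effect, sd=1.0, n_per_group=2000)

# Small experiment: about one hundred site means in total (50 per group).
# Site means are assumed to vary less than individual scores (sd=0.5).
power_small = approx_power(effect, sd=0.5, n_per_group=50)

# Same small experiment, but with extra random error that doubles the variance.
power_noisy = approx_power(effect, sd=0.5 * math.sqrt(2.0), n_per_group=50)

print(f"large N, individual scores: {power_large:.3f}")
print(f"small N, site means:        {power_small:.3f}")
print(f"small N, added error:       {power_noisy:.3f}")
```

With these made-up inputs, the large experiment detects the effect almost every time, the small one detects it noticeably less often, and doubling the error variance in the small one lowers the detection rate further. That is the sense in which a small experiment cannot afford to be cavalier about random error.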

It is one thing to fail to mention that your source is potentially biased because the same foundation funded both some of the losing programs in PFT and the "outside" evaluators. But it is quite another to fail to mention that your source has itself been criticized for systematically adopting procedures that all tend to maximize random error, many with dubious or no scientific justification, to the benefit of the very programs with those financial ties. The prudent advocate would at least attempt to explain these criticisms away, but in any event, readers should be made aware of these significant infirmities in the underlying studies.

Kohn concludes his main argument with a gratuitous swipe:

Furthermore, even if Direct Instruction really was better than other models at the time of the study, to cite that result today as proof of its superiority is to assume that educators have learned nothing in the intervening three decades about effective ways of teaching young children. The value of newer approaches – including Whole Language, as we’ll see -- means that comparative data from the 1960s now have a sharply limited relevance.

I bet Kohn wishes he could take back that crack about whole language (an educational philosophy so bad its advocates had to re-brand it to disassociate it from the lengthy trail of negative data it amassed).

And what evidence is there that today's educators have learned anything since the mid-1970s (not the 60s, as Kohn claims)? The longitudinal NAEP data tell a much different story.

Next Kohn gets into the kitchen sink part of his argument. Let's take each in turn.

  1. First Kohn cites some newspaper accounts on DI. Kohn admits these accounts are anecdotal. I agree with Kohn, but I'm wondering why he included them in his argument anyway. I'm guessing lurid innuendo. I could go into detail refuting the points Kohn recounts, but it's not worth the effort for anecdotes such as these.
  2. Next, Kohn commits another fallacy of division when he states "it’s common knowledge among many inner-city educators that children often make little if any meaningful progress with skills-based instruction." Of course, there's lots of data from inner-city schools pertaining to DI showing that this statement isn't true with respect to DI.
  3. Last, Kohn claims that "a lot more research dating back to the same era as the Follow Through project supports a very different conclusion" and then goes about citing various longitudinal studies which purport to show better long-term outcomes for various child-centered P-3 programs (which Kohn prefers) than for DI programs. Apparently, Kohn hasn't heard of "confounding variables." He also chooses to ignore the fact that many of these studies were conducted by the sponsors of these programs and that some had serious methodological flaws, and he fails to mention that the High Scope study was sponsored by the same people responsible for one of the worst performers in PFT.

Here's Kohn's big conclusion:

Still, with the single exception of the Follow-Through study (where a skills-oriented model produced gains on a skills-oriented test, and even then, only at some sites), the results are striking for their consistent message that a tightly structured, traditionally academic model for young children provides virtually no lasting benefits and proves to be potentially harmful in many respects.

This is only true for the studies Kohn has chosen to cite, which are only the "negative" ones. Kohn has basically cherry-picked the studies and excluded all the ones showing positive effects, such as Gary Adams' meta-analysis and the underlying studies. Kohn also cites his research uncritically and has ignored all the criticism directed at the studies he cites. He doesn't even offer an explanation; he simply ignores it. He also ignores all the potential bias problems and methodological flaws in his cited studies. Kohn seems to have an unrealistically high standard for DI studies and a very low one for research with conclusions he likes. I find it hard to believe that any study cited by Kohn in any of his books could withstand the constraints imposed by House et al. on the PFT data.

In short, Kohn's "hard evidence" against DI appears to be almost exclusively opinion, rather than fact. Little of this opinion is supported by data, though Kohn's use of selective quotes from "research" attempts to convey that impression. And completely ignoring all the contrary evidence and presenting such a one-sided evaluation based on that cherry-picked "evidence" is reprehensible for anyone claiming to be dispassionate.

In this analysis of DI, Kohn has shown himself to be an untrustworthy advocate, and the same pattern of scholarly malfeasance is evident in all of his writings that I've read.


Unknown said...

I was wondering what word one could use other than "lying" as I read your piece, but I think "scholarly malfeasance" is just perfect.

Nice work.

Anonymous said...

Yep. Seems to me you make a good case for scholarly malfeasance, and you do so this time around without attacking Kohn personally.

What strikes me is that Daniel Moynihan's assessment in 1968 is as apt today as it was 40 years ago:

"We had thought (as legislation such as Title I was passed) we knew all that really needed to be known about education in terms of public support, or at the very least, that we knew enough to legislate and appropriate with a high degree of confidence. . . . We knew what we wanted to do in education, and we were enormously confident that what we wanted to do could work. That confidence. . . has eroded. . . . We have learned that things are far more complicated than we thought. Thereafter simple input-output relations which naively, no doubt, but honestly, we had assumed to obtain in education simply, on examination, did not hold up. They are not there. (Cited in McLaughlin, 1975, p. 49)"

The logic of "planned variations" had great merit. It also had some flaws, but rather than eliminating the flaws and "doing it again, till we get it right," the Government-Academic-Publisher complex buried the methodology and drew conclusions that were directly opposite to the results. These conclusions still rule. They're alive and well in Balanced Literacy and constructivist math and in walking "authorities" like Kohn.

The 1965 Elementary and Secondary Education Act was based on better educational intelligence than the 2001 No Child Left Behind Act. And the Follow Through Study was based on better methodology than the 2008 (Non)Impact Study of Reading First.

Go figure.

Anonymous said...

What does this mean?

"The 1965 Elementary and Secondary Education Act was based on better educational intelligence than the 2001 No Child Left Behind Act."

Anonymous said...

Well, that's a long story. But in short.

ESEA 1965 included five Titles:
The framers recognized that the “problems” in education were in large urban areas and with poor and minorities.
That was Title I.

The only organized ed lobbying group at the time were librarians. School Libraries are a good thing.
That was Title II.

School people were contending that “we have lots of good ideas, but they aren’t being ‘disseminated’”
That was Title III.

State Departments needed to be strengthened, so they got a slice.
That was Title V.

Having run out of "good ideas" the framers recognized that the ideas were unlikely to have any major effect. Large scale R&D was needed. Congress appropriated $100 million for the construction of Laboratories to conduct the R&D.

NCLB relied on the "New Science of Reading" which turned out to be pseudo-science. It relied on "standards" which turned out to be largely rhetoric. And it relied on statistically impossible "adequate yearly progress" to achieve its goal in 12 years. Then it added a load of pork-barrel provisions.

Anonymous said...


Nice job making your case without insults.

Alfie is a professor at Harvard. There should be plenty of faculty at Harvard who are capable of understanding this argument. (Though perhaps not in the ed school.)

Is there a mechanism to censure Alfie? Can the Harvard faculty be organized to publicly reprimand him for his scholarly malpractice?

One example of this is when scientists and economists sign public letters around important social issues, such as global warming or the need for stimulus spending.

Perhaps we could get a public letter signed by professors in fields outside of education who find Alfie's conduct offensive?

Has this been attempted before? I think the Follow Through folks should have been able to find lots of support among folks outside of education who value intellectual honesty and responsible analysis of research when determining public policy.

Anonymous said...

"Alfie is a professor at Harvard."

I do not believe so.

The bio page on his website doesn't mention this at all. I do not believe Alfie would fail to mention that he was a professor at Harvard if it were true.

-Mark Roulo

Anonymous said...

Put much more simply, Kohn keeps his personal life cards close to the vest because he is bluffing; they wouldn't hold up to scrutiny under his own theory.

So kids whose parents held them accountable and showed tough love while they grew up are miffed at their parents. Wow, what a headline!

Sounds more like Alfie has some unresolved issues he is trying hard to rationalize into a parenting theory.