October 8, 2008

The WWC falls down on the job again

The What Works Clearinghouse (WWC) does a noble job of identifying much of the junk science that plagues education and masquerades as real research. The WWC, however, is not without its faults.
I have noted at least two instances in which the WWC has given its imprimatur to very questionable research.

In August, the WWC released a report on Reading Mastery, one of the most researched reading programs in existence. Despite the fact that other reputable organizations have found that much of the Reading Mastery research base passes scientific muster, the WWC did not find a single study that met its standards. Clearly something was amiss.

The author of Reading Mastery, Zig Engelmann, has just weighed in on the WWC's latest shenanigans: Machinations of What Works Clearinghouse. Basically, Zig says that the WWC failed to locate a large portion of the extant post-1985 Reading Mastery research base, improperly excluded the entirety (38 studies) of the pre-1985 research base, and used dubious criteria for excluding at least one study it did consider. I suggest you read the whole thing. I'll elaborate on two points that Zig raises.

Dubious Rationale for Excluding Pre-1985 Research

The WWC arbitrarily limits its research review to studies reported no earlier than 1985 (unless the WWC principal investigator deems the study important enough to report). This 1985 cut-off makes little sense. Beginning reading performance hasn't changed much since 1985. In fact, we have readily available evidence that it hasn't changed much since as early as 1971. That evidence is the NAEP Long-Term Trend in Reading test data (not to be confused with the plain ol' NAEP test, which changes frequently). Here's a graph of the performance of nine-year-olds (4th grade):

[Graph: NAEP Long-Term Trend reading scores for nine-year-olds, 1971-2004]

As you can see, the performance of nine-year-olds in reading stayed remarkably flat during the period 1971-2004, with little difference between pre-1985 scores and post-1985 scores. My back-of-the-envelope calculation is that the change between 1971 and 1999 is less than a quarter of a standard deviation, i.e., not educationally significant. In fact, scores in 1980 were higher across the board than they were in 1999. Only after 1999 do scores rise above the 1980 high-water mark.
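
For anyone who wants to redo that back-of-the-envelope math, here is a minimal Python sketch of the calculation: the change in the mean score divided by the standard deviation. The function name and the input values are my own placeholders for illustration, not actual NAEP figures; plug in the real means and standard deviation from the NAEP tables to check the claim.

```python
def standardized_change(earlier_mean: float, later_mean: float, sd: float) -> float:
    """Return the change in mean score expressed in standard-deviation units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (later_mean - earlier_mean) / sd


# Placeholder inputs for illustration only; these are NOT actual NAEP figures.
example_earlier_mean = 200.0
example_later_mean = 205.0
example_sd = 40.0

change = standardized_change(example_earlier_mean, example_later_mean, example_sd)
print(f"Change: {change:.3f} SD")

# A quarter of a standard deviation is the rough threshold used in the text.
if abs(change) < 0.25:
    print("Under a quarter of a standard deviation")
else:
    print("A quarter of a standard deviation or more")
```

With the placeholder numbers, a 5-point gain on a scale with a 40-point standard deviation works out to 0.125 standard deviations, which falls under the quarter-of-a-standard-deviation threshold.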

Since we have reliable data going back to 1971 showing similar performance in early reading, there is no compelling reason to arbitrarily set the cut-off at 1985. The rationale the WWC offers is lame:

... the fact that preschool enrollment has increased, combined with the fact that more preschool and kindergarten programs run full-day, means that students in the early grades may be better prepared to receive reading instruction today than students 25 years ago. Moreover, it is possible that any changes in reading readiness over this period may not have been evenly distributed, since differences in reading ability by socioeconomic status and race are apparent at the kindergarten level . . . Any of these changes could have implications for the effectiveness of an intervention. If school readiness has increased, then an intervention that was effective 25 years ago may not be effective in more recent years. (p. 2, Appendix A)

Perhaps the WWC hasn't heard, but there isn't any evidence that preschool, full-day kindergarten, and Head Start provide any effects that don't quickly fade out. In fact, all of the potential causes given by the WWC (for none have been confirmed by research) must be superficial and superfluous to reading performance, since the NAEP data show that none of them has had a significant effect on reading performance.

This is a somewhat embarrassing admission coming from the WWC, what with its lofty evidentiary standards and all. I also suggest you read Zig's evisceration of this argument, which concludes:

The assertion that the children are better prepared now and therefore what was effective 25 years ago might not be effective now is logically impossible. Lower performers make all the mistakes that higher performers make. They make additional mistakes that higher performers don’t make and their mistakes are more persistent, more difficult to correct. Therefore, if the program is easier for them now because of their higher degree of undefined “readiness,” they will make fewer mistakes and progress through the program sequence faster.

...

[B]eginning reading for grades K–3 is stable because nothing of significance has changed in the last 40 years. The instructional goal is the same—to teach children strategies and information that would permit them to read material that could be easily covered with a vocabulary of 4,000 words. The frequency of these words has not changed. The syntax of the language has not changed significantly. For these reasons, the content of the first four levels of Reading Mastery has not changed over the years.

I am not aware of any properly conducted scientific research that has a shelf life of only 20 years. Research doesn't go bad. I'm not going to stop taking penicillin-based drugs while the research gets updated because the basic research was conducted 80 years ago. And I see little reason for the WWC to exclude any properly conducted research on Reading Mastery, such as Project Follow Through, or on any other educational program for that matter.

Dubious Confounding Factors

It's bad enough that the WWC failed to even locate, much less consider, a majority of the extant Reading Mastery research. It's even worse that it set an arbitrary cut-off date that excluded at least 38 studies on Reading Mastery. However, improperly excluding a study (which otherwise meets all the selection criteria) because the new teachers were provided initial training is beyond the pale.

The RITE study (Carlson and Francis, 2002), which involved 9300 students and 277 teachers (Zig claims that it is "probably the second largest instructional study ever conducted (after Project Follow Through)"), met all of the WWC's exceedingly high selection criteria. However, the WWC excluded the study because "support [was] provided to teachers through the RITE program," which the WWC believes to be a confounding factor. Here's the confounding "support" the teachers received:

This support consisted of summer training, less than two hours of monitoring during the year, and help from a designated trainer. Nearly half of the teachers (137) were in their first year of teaching Reading Mastery. The training focused on how to provide positive reinforcement, how to correct specific errors, how to organize and manage the classroom so that one small group is in reading instruction while the other two groups are engaged in independent work and are not disrupting the instruction... The teachers were trained to teach Reading Mastery exactly the way the [Teacher's] Guide describes it, with all the technical details in place.

This is not only a ridiculous reason for excluding an otherwise acceptable study, but also contrary to the WWC's own protocol, which permits the inclusion of "commercial programs and products that [have] an external developer who: Provides technical assistance (e.g., provides instructions/guidance on the implementation of the intervention)." (p. 6, Protocol)

The WWC excluded many other otherwise acceptable Reading Mastery studies based on "confounding factors." I wonder how many of those confounding factors were related to initial training, as in the RITE study. I know that more than one study was excluded because the control group initially performed at least half a standard deviation above the Reading Mastery group; yet despite this advantage, the Reading Mastery group outperformed the control group by the end of the study. I'm thinking that the magnitude of the effect size more than compensates for the reliability issue caused by the initial discrepancy, which favored the control group.

In any event, there you have it: the WWC failing to do its job properly yet again. This is becoming a pattern.