August 31, 2006

The Last Word (Hopefully) on Class Size Reduction

Never willing to let well enough alone, Mike in Texas dredged up the class-size debate again in the comments of this post. I'm weary of rehashing these arguments over and over in comment battles, so I'm going to make this post my definitive class size rebuttal and just link to it when the need arises.

Executive Summary


Class size reduction in and of itself is not a proven technique for raising student performance. Class size reduction in combination with other factors does form a necessary factor in successful programs that do raise student performance. In any event, the research on class size generally stinks; it is riddled with methodological shortcomings and typically shows small effect sizes. Given this, the high expense that reducing class sizes entails, and the present teacher shortage, reducing class size does not represent a good investment for raising student performance.

I. The Hanushek-Krueger meta-analyses


Both Hanushek and Krueger performed a meta-analysis on the extant class size research. For policy purposes, their conclusions are the same.

A. Hanushek (The Evidence on Class Size) (The Class Size Debate)

In a meta-analysis of 59 studies yielding 277 estimates of the effect of class size on student achievement, Hanushek found that 14.8% of these estimates were positive and significant. That is, students in smaller classes showed significantly higher achievement than their counterparts in larger classes. The remaining estimates were either insignificant (no difference in achievement, 71.9%) or negative and significant (smaller classes had lower achievement, 13.4%).

B. Krueger (Economic Considerations and Class Size) (The Class Size Debate)

Krueger reanalyzed the studies in Hanushek's meta-analysis using three alternate methods of analysis, only one of which is not controversial. Krueger found that 25.5% of these estimates were positive and significant. The remaining estimates were either insignificant (61.2%) or negative and significant (10.3%).

Much as Hanushek found, a majority of studies had insignificant or negative results.

C. Effect Sizes

The Hanushek-Krueger debate is all but academic. Here is the important finding by Hanushek that makes the debate academic:
More importantly, the estimated magnitudes are very small. A class size reduction of 10 students, which approximately cuts average class size in half and represents a 2½ standard deviation movement, is never estimated to yield more than 0.12 standard deviations improvement in student achievement for the results that are statistically significant. When results are separated for students eligible for free or reduced lunches, the performance of disadvantaged students is found to be more sensitive to class size: A 10 student reduction in class size reductions could yield as much as 0.19 standard deviations (in fifth grade math performance). Estimated class size effects for students ineligible for free or reduced lunch are, however, less than half the size of those for disadvantaged students and are more frequently insignificant.
This finding is not in dispute.

It's generally recognized that effect sizes smaller than about 0.25 standard deviations are not educationally significant. In other words, in the real world it is not worth pursuing an intervention whose research shows only such a tiny effect size. To put it in perspective, in theory your typical Title I school performing at the 20th percentile (80% of the kids are not at grade level) would only be boosted to the 28th percentile (72% of the kids below grade level) with the application of an intervention having a 0.20 effect size. In reality the increase would be less than that.
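The percentile arithmetic can be checked with a short sketch. It assumes achievement scores follow a normal distribution, which is an idealization and not something the studies guarantee, and the function name `percentile_after_shift` is made up for illustration. Under that assumption a 0.20 standard deviation gain moves a school from the 20th percentile to roughly the 26th, a bit below the 28th percentile quoted above and consistent with the caveat that the real gain would be smaller.

```python
from statistics import NormalDist


def percentile_after_shift(start_percentile, effect_size):
    """Percentile reached after a gain of `effect_size` standard deviations.

    Assumes normally distributed scores (an assumption, not a finding
    from the studies discussed in the post).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + effect_size)


# A school at the 20th percentile with a 0.20 effect size intervention.
print(round(percentile_after_shift(20, 0.20), 1))  # roughly 26

# The same school with an effect size at the 0.25 threshold.
print(round(percentile_after_shift(20, 0.25), 1))
```

Either way the school remains well inside the bottom third, which is the point of the comparison.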

II. Project STAR

The big kahuna of misrepresented, misinterpreted, and misused education research -- Project STAR.

Project STAR should be the poster child of what ails education research. To educrats Project STAR is now a slogan -- a talisman they can wave that proves that class size reduction is the panacea for our education woes. Hardly.

Let's start with the methodological flaws in the research, which are significant:

  1. that not all students started the experiment at the same time, because kindergarten was not mandatory or universal in Tennessee;
  2. sizable attrition occurred over the course of the experiment because of mobility and other factors, and this attrition was likely not random;
  3. parents, teachers, and schools knew they were part of an experiment and, because of pressures from parents, part of the experiment was compromised by re-assignments of students;
  4. no achievement tests were given before kindergarten, making it difficult to analyze whether elements of the random-assignment process contributed to any subsequently observed achievement differences;
  5. approximately 6 percent of the students were transferred across treatment groups at the end of the first year of the experiment; and
  6. there was some drift from the target class sizes of 15 and 22 so that there is actually a distribution of realized class size outcomes over time in both treatment groups.

Each of these issues has been raised by the initial researchers (e.g., Finn and Achilles, 1990) and by later interpreters of the results (e.g., Mosteller (1995) and Krueger (1997)), but the experimental data do not provide information that permits fully ascertaining the effects of such possible problems.

These methodological flaws remain unrebutted and go right to the heart of whether Project STAR is even valid research. But let's not get caught in that thicket. We don't even need to go there. Once again small effect sizes come to our rescue.
The Lasting Benefits Study, which has traced students after the end of the STAR experiment, showed that students from the small K-3 classes maintained most of the prior differences through the sixth grade (Nye et al., 1993). Comparisons of small versus regular classrooms yielded effect sizes on the norm-referenced third grade tests of 0.24 and 0.21 for reading and math, respectively (Word et al., 1990). In the sixth grade, three years after the end of any differential resources for the two groups, the effect sizes for comparisons of students previously in small versus regular classrooms were 0.21 and 0.16 for reading and math, respectively (Nye et al., 1993). In other words, the differentials in performance found at kindergarten remain essentially unchanged by third grade after class size reductions of one-third were continuously applied (see figures 6 and 7) and remain largely unchanged by sixth grade after class size returned to its prior levels for another three years. This latter finding leads to rejection of the fall-back model and indicates that class size reductions after kindergarten have little potential effect on achievement.
Once again we are confronted with educationally insignificant effect sizes. And even these minuscule results required reducing class sizes down to 13-17 students. And don't forget about all the methodological flaws casting serious doubt on even these results.

See tables 2 and 3 (pp. 20-22) of this study for a complete list of effect sizes obtained in STAR.

III. Student Achievement Guarantee in Education (SAGE)


Think of SAGE as STAR's ugly stepsister. Although it is commonly believed that SAGE evaluated the effects of reducing class sizes, it did not. Class size reduction was only one component of the intervention. Here are some of the other components:
  1. a longer school day and increased collaboration with community organizations
  2. a more rigorous academic curriculum
  3. staff development and accountability mechanisms
So which of these factors contributed to increased student achievement? We don't know. If you want to implement the SAGE intervention, it's a package deal. You have to do class size reduction plus the three other not-insignificant components.

So, what's your reward at the end of the day? You guessed it -- the usual, small effect sizes:

Effect Sizes After Three Years in SAGE
Math: 0.193
Reading: 0.136
Language Arts: 0.350
Total: 0.213

Once again we have mostly educationally insignificant effect sizes.

Conclusion

The research, such as it is, simply does not support the commonly held notion that reducing class sizes will have an educationally significant effect on student performance. There are many other reasons why teachers are giddy for reduced class sizes, but improved student performance is not one of them.

This seems counterintuitive, even paradoxical. I agree. Let me offer an explanation. Ineffective teaching practices are relatively immune to class size changes. The system is just as broken with 15 kids in the class as it is with 30 kids. Throwing a few more kids in a classroom isn't going to break the system any more than it's already broken. Teachers generally present material inefficiently and without regard to whether students are actually learning. Certainly some kids are learning the material -- the same top 25% who've always learned no matter how poorly the material is presented. Gaining a few extra minutes a day of teacher time through reducing class size isn't enough to significantly alter this dynamic. As a result, failure persists.

On the other hand, effective teaching practices are very sensitive to class size issues. I'll discuss those conditions in part II along with why I ultimately conclude that teacher presentation size is a critical component to increasing student achievement in highly effective instructional programs at least at the elementary school level.

Take me right to Part II.

15 comments:

Mike in Texas said...

Here is a review of Hanushek's work:
http://www.irs.princeton.edu/pubs/pdfs/451.pdf

"Hanushek extracted 277 estimates of the effect of class size from 59 studies"

Gee, don't suppose he introduced a little bias into the results.

Oh wait, he did.

"The number of estimates from each study varies widely: as many as 24 estimates were extracted from each of two papers (which used the same data set), and only 1 estimate was extracted from 17 studies apiece."

Gee, pull 24 estimates from a couple of studies I like, and only 1 from studies I don't, that should make it all accurate. NOT

Hanushek is an economist. It's amazing how many of them seem to think they are experts on education.

BTW, the STAR project is one of the most thoroughly examined education studies in existence. It is also regarded as one of the most sound studies.

It's cute how you think you're the definitive source on all education subjects.

KDeRosa said...

"Here is a review of Hanushek's work:"

Yeah, MiT, that's the Krueger study I cited. Now go back and read what I wrote about it. No wait, first go back and learn what "effect size" means, then go back and read what I wrote; that way you might understand it this time around.

"Gee, don't suppose he introduced a little bias into the results."

Actually, that's higher than I expected. Usually 90% of education studies are junk science, even though I realize you don't understand what this means by your comment.

"Gee, pull 24 estimates from a couple of studies I like, and only 1 from studies I don't, that should make it all accurate. NOT"

So then do it the way Krueger did; the results aren't much better. And the effect size is still low.

"BTW, the STAR project is one of the most thoroughly examined education studies in existence. It is also regarded as one of the most sound studies."

Actually, many other independent researchers have raised the same issues that Hanushek did. Only dopey educators like you still take the STAR project at face value. Can you say confirmation bias?

Of course, no one disputes the small effect sizes that resulted from STAR either. So even if you still want to maintain your sad delusion that STAR is good research, you still need to explain away the fact that it found that lowering class size had an educationally insignificant effect -- another simple term you don't seem to understand.

"It's cute how you think you're the definitive source on all education subjects."

Beats making a fool of yourself down in the comments by displaying an utter lack of understanding about the scientific method like you just did.

If you understood why, you'd be embarrassed right now, though I'm sure you're not.

Mike in Texas said...

You claim the benefits are not significant. The United States govt. disagrees with you.

http://www.ed.gov/pubs/ReducingClass/Class_size.html#research

Here are some tidbits from the above link:

"A consensus of research indicates that class size reduction in the early grades leads to higher student achievement."

"The research data from the relevant studies indicate that if class size is reduced from substantially more than 20 students per class to below 20 students, the related increase in student achievement moves the average student from the 50th percentile up to somewhere above the 60th percentile. For disadvantaged and minority students the effects are somewhat larger."

"Differences among the three class types were highly statistically significant for all sets of achievement measures and for every measure individually. In every case, the significance was attributable to the superior performance of children in small classes, and not to classes with full-time teacher aides."

Notice the use of the word significant in that last paragraph. I guess they didn't hear you decided what significant means and they disagree.

If you scroll down there is a table listing different effects, and reading for minorities is listed as a positive 0.35 effect after 4 years of small classes, which tops your definition of significant and meaningful.

"The results of both projects favored small classes in academic achievement small-class effect sizes were in the range 0.4 to 0.6 "

Wait a minute!! That EASILY tops your definition of significant!

"Teachers of small classes spent significantly more time on task and significantly less time on discipline or organizational matters compared with teachers of regular-size classes."

Whoops, there's that word again! Gee, being able to spend more time teaching and less time on discipline and organizational matters might affect classroom learning.

KDeRosa said...

"You claim the benefits are not significant."

That's not what I claimed. I claimed the benefits were not educationally significant (less than a 0.25 standard deviation effect size).

"The research data from the relevant studies indicate that if class size is reduced from substantially more than 20 students per class to below 20 students, the related increase in student achievement moves the average student from the 50th percentile up to somewhere above the 60th percentile. For disadvantaged and minority students the effects are somewhat larger."

That tidbit overstates what the valid research actually shows. But even a ten point shift in student achievement only represents a 0.26 effect size. This is considered to be a small effect size. It is barely educationally significant. And due to the tremendous costs associated with reducing class size down to the level needed to achieve even these meager gains, you'd have to be nuts to effect such an intervention.
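Going the other direction, from a percentile shift back to an effect size, can be sketched the same way. This assumes normally distributed scores, and `effect_size_for_shift` is a hypothetical helper name, not something from the cited report. Under that assumption a move from the 50th to the 60th percentile works out to about 0.25 standard deviations, in line with the figure quoted above.

```python
from statistics import NormalDist


def effect_size_for_shift(from_percentile, to_percentile):
    """Effect size (in standard deviations) implied by a percentile shift.

    Assumes normally distributed scores; this is an illustrative
    assumption, not a property established by the studies.
    """
    std_normal = NormalDist()
    z_from = std_normal.inv_cdf(from_percentile / 100)
    z_to = std_normal.inv_cdf(to_percentile / 100)
    return z_to - z_from


# The average student moving from the 50th to the 60th percentile.
print(round(effect_size_for_shift(50, 60), 2))  # about 0.25
```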

"Notice the use of the word significant in that last paragraph. I guess they didn't hear you decided what significant means and they disagree."

It means they are being sloppy with their language. Statistical significance and educational significance are defined terms in research.

After that, MiT, you must be missing a link because all the rest of the quotes you are citing don't come from the link you've given.

KDeRosa said...

Ok, I found your missing link.

Not sure where they are pulling that table data from, but it doesn't comport with the effect sizes reported in the actual Project Star summary report. See p. 21, table 7.

"Notice the use of the word significant in that last paragraph. I guess they didn't hear you decided what significant means and they disagree."

They are talking about statistical significance, MiT. With large sample sizes, as in Project STAR, even small mathematical differences can be statistically significant. So a small effect size of, say, 0.10 could be statistically significant, but it is still too small to be educationally significant.
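A rough sketch shows how sample size alone drives statistical significance. It uses a simple two-sample z-test with equal group sizes and unit standard deviation; the group sizes below are hypothetical and are not taken from Project STAR, and `two_sided_p_value` is a name invented for this illustration.

```python
import math


def two_sided_p_value(effect_size, n_per_group):
    """Two-sided p-value for a difference of `effect_size` standard
    deviations between two equal-sized groups (simple z-test sketch).
    """
    # Standard error of the difference in means is sqrt(2 / n) when SD = 1.
    z = effect_size * math.sqrt(n_per_group / 2)
    # P(|Z| > z) for a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Same 0.10 effect size, two hypothetical sample sizes.
small_study = two_sided_p_value(0.10, 50)    # well above 0.05
large_study = two_sided_p_value(0.10, 3000)  # far below 0.05
print(small_study, large_study)
```

The effect size is identical in both calls; only the number of students changes, and that is enough to flip the result from "not significant" to "highly significant."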

MiT, these are important concepts in education research, you really need to do your homework and learn what they mean.

Then you go and uncritically cite the Success Starts Small "Study" which followed four whole teachers. This is what we call a small sample size. And then we have this handy little confounding variable.

"During the first year, the teachers received twenty hours of “staff development studying strategies for more active learning for six-year olds.” They also visited small-size classrooms in surrounding districts and met on a weekly basis to debrief and plan. They attended seminars and were introduced to computer-based learning. These teachers focused on finding ways to work with all children."

I'll leave it as an exercise for you to figure out why this would be a problem and why those outlier effect sizes are problematic.

Unknown said...

Given that I have multiple sections of 250+ students each, I don't have much sympathy for the call for reduction. Somehow, we manage to get through the material, and my students' test scores follow a reasonably normal distribution.

I did laugh when I saw the appeal to authority argument about the US Government. Funny. That was a joke, I hope, since any college freshman should know why that is not an argument.

Kilian Betlach said...

KdR

We've been around this bush before, but hey, why not? I'm waiting to see the study that controls for teacher effectiveness.

"Ineffective teaching practices are relatively immune to class size changes. The system is just as broken with 15 kids in the class as it is with 30 kids. Throwing a few more kids in a classroom isn't going to break the system any more than it's already broken. Teachers generally present material inefficiently and without regard to whether students are actually learning."

Sure, but do you think a teacher presenting material effectively to a group of 33 (or 40 or 45) will be just as effective as if the size was 15?

Your critiques are valid, but they do not impugn class size reduction per se, but rather reveal it to be a generation-2 reform, effective only after other measures (teacher quality, for example) have been addressed.

KDeRosa said...

Hi TMAO.

You must have skipped the executive summary:

"Class size reduction in and of itself is not a proven technique for raising student performance. Class size reduction in combination with other factors does form a necessary factor in successful programs that do raise student performance."

Not to mention my part II teaser:

"On the other hand, effective teaching practices are very sensitive to class size issues. I'll discuss those conditions in part II along with why I ultimately conclude that teacher presentation size is a critical component to increasing student achievement in highly effective instructional programs at least at the elementary school level."

Anonymous said...

"Sure, but do you think a teacher presenting material effectively to a group of 33 (or 40 or 45) will be just as effective as if the size was 15?"

Let me throw another issue into the mix. Do you think that the effectiveness (or lack thereof) of class size applies equally to all topics? Why would class size affect, say, a physics lecture? What is the difference between, say, 100 students (a small lecture) and 350?

I suspect, though I could just be being a mean old hatemongering evil bigot again, that about 90% of this howling about class size boils down to one thing: less workload.

But I could be wrong. I doubt it, but it's possible.

KDeRosa said...

Beginning reading is a good example of a difficult subject that requires smaller class sizes.

Class size should be a function of student ability. Lower performers need more teacher assistance. Higher performers can tolerate larger class sizes. Even the class size research bears this out -- minorities do benefit more from reduced class sizes, whether through greater teacher attention or by reducing discipline issues.

As kids learn how to learn, class size will play less and less of a factor.

By the high school level I'd say that you're much better off in a large class with a good teacher than a small class with an average one. This is also true to a lesser extent in the lower grades.

Kilian Betlach said...

I read it KdeR; my point is I feel like you omitted the word "yet." As in "class size reduction does not represent a good investment... yet." There are other areas in need of reform first, but that doesn't mean the entire concept is invalid, as you pointed out.

Rightwingprof,

To me, it's more about quantity and quality of interaction than workload. I'll grade until the cows come home (but know that as long as I'm grading I'm not working to develop better practices), but I want more and deeper interactions with my students. I teach something research is starting to call "long-term English Language Learners," kids who picked up survival English in K-1 and never progressed. I need them talking and interacting with material and myself, and the fewer kids in the room, the better able I am to have those interactions. I can do the job with 33, but I can do a better job with 15.

Anonymous said...

33? 15? I'm talking about 250 students. What kind of interaction is possible? Sure, there are students who ask questions and participate, but we can't have a big group hug session. And how do we reduce these classes to 15 or 33? Who is going to teach them? We'd have to triple the number of faculty, and how realistic is that?

Natural sciences and math departments at universities are already over a barrel because they teach a large number of mandatory (for all students) courses, and have very few native English speaking grad students. The result is that endless debate over "I can't understand my teacher." But what else are they going to do? Who is going to teach those classes? The Americans coming into the university have been cheated out of the necessary math and science skills to be grad students, and teach the classes. Add to all that the fact that tenured class loads are very small, and they're not going to increase.

Anonymous said...

Rightwingprof,
You see no difference in teaching college and elementary school? 250+ students in a college course is understandable, if you believe in college professors as the masters and providers of knowledge because the students can't learn it for themselves, but elementary kids are at the beginning of a long educational career and need to be taught how to learn, not how to listen to someone lecture for an hour while taking copious notes for an exam or term paper.

Anonymous said...

Reducing class size is one of the few educational reforms shown to improve student achievement and narrow the achievement gap between ethnic and racial groups. In fact, the federal government cites smaller classes as one of only four “evidence-based” strategies that rigorous research has shown to improve learning.

(See http://www.ed.gov/rschstat/research/pubs/rigorousevid/index.html)

Moreover, the effect sizes for black and poor kids in the STAR research are not small -- in fact, being in a small class in K-3 essentially reduced the achievement gap by about 38%.

In the Tennessee STAR study, only 16.7% of inner-city students placed in small classes in the early grades were retained through the 9th grade, compared to 43.5% of those from similar backgrounds who had been in regular size classes.

In 4th, 6th, and 8th grade, students who attended small classes in the early grades were significantly ahead of their regular-class peers in all subjects. By 8th grade, they were still almost a full year ahead of their peers.

For those who attended a smaller class in grades K-3, the gap between black and white students who took college entrance exams was cut in half.

The only reason some people continue to dispute the benefits of smaller classes is that they don't want to actually spend the money to improve public education.

KDeRosa said...

No one disputes that reducing class size combined with more effectively teaching to those smaller classes increases achievement. The dispute is that the effect size is small and the cost is high. The bang for the buck is small.

As I've pointed out, and hopefully you've read, the STAR study is not without its methodological flaws, especially when it comes to the minority achievement numbers. Even so, the effect size is in fact considered to be small and barely educationally significant.

"The only reason some people continue to dispute the benefits of smaller classes is that they don't want to actually spend the money to improve public education."

No, the reason it's disputed is because the bang for the buck is so small.

If you are so concerned with raising minority student achievement, why would you not rather change the curriculum to Direct Instruction or Success for All, both of which use small instructional groups and have effect sizes that are 2 to 4 times those achieved in STAR at a fraction of the cost?