March 13, 2008

Statistical Illiteracy

Here's a nice example of some dubious reasoning based on unwarranted statistical assumptions.

Iris C. Rotberg, a professor of education policy at George Washington University, said any comparisons based on international tests, such as PISA, would be more reflective of the poverty in a state—or country—than of the quality of its schools or teachers.

“Making more comparisons and having more tests won’t solve the basic problem: We have a lot of kids living in poverty,” she said. “Governors can probably predict what their test scores will look like.”

Comparing poverty effects and school/teacher quality effects cross-nationally is fraught with problems. Each country defines poverty differently, and I'm not sure anyone has devised a methodology for rating schools or teachers that is objective, comports with reality, and has predictive value.

The OECD botched its analysis of SES effects for the 2003 PISA. No doubt this report forms the basis of Professor Rotberg's dubious comparison.

Fig. 4.8 (p. 176) of the 2004 PISA report is a pretty graph of the relationship between math performance and SES across OECD countries.

See the pretty trend line? Notice how the data is a poor fit to the trend line? Notice how the best of the low-SES students outperformed the worst of the high-SES students?

If you turn to Table 4.3b, col. 3, p. 398, Annex B1, you see that the OECD average R-squared is only 17.9%.

This means that only 17.9% of the variance in student performance is accounted for by the variance in SES. Notice how I was careful to avoid any implication of causality, something that the OECD and Professor Rotberg take a rather cavalier attitude toward.

This means that 82.1% of the variance in student performance is not accounted for by the variance in SES. Other factors, such as school quality (along with plain measurement error), account for this remaining 82.1%.
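As a sanity check on the variance arithmetic, here's a minimal simulation sketch. This is not the PISA microdata; the SES index, score scale, and the 18/82 split are assumed values chosen to mimic the reported OECD-average R² of 17.9%:

```python
import numpy as np

# Toy illustration, NOT the actual PISA data: simulate a standardized SES
# index and test scores on a PISA-like scale, wiring SES to contribute
# roughly 18% of score variance (the OECD-average R^2 the tables report).
rng = np.random.default_rng(0)
n = 5000
ses = rng.normal(0.0, 1.0, n)                         # standardized SES index
ses_part = np.sqrt(0.18) * ses                        # ~18% of variance from SES
other_part = np.sqrt(0.82) * rng.normal(0.0, 1.0, n)  # everything else
score = 500 + 100 * (ses_part + other_part)           # mean ~500, SD ~100

r = np.corrcoef(ses, score)[0, 1]   # simple bivariate correlation
r_squared = r ** 2                  # share of variance "explained" by SES
print(f"R^2 = {r_squared:.3f}")                    # lands near 0.18
print(f"unexplained share = {1 - r_squared:.3f}")  # lands near 0.82
```

Note that the simulation builds the correlation in by construction; a real-world R² of 0.18 says nothing by itself about which way any causal arrow points.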

The OECD tried to quantify school effects on student performance. Somewhat unsuccessfully. (I challenge anyone to read Fig. 3.6 without giggling.)

It may be the case that school effects do account for less than poverty effects. Poverty effects account for less than 18% of the variance; school effects could account for even less. Certainly, most schools are clueless when it comes to educating lower-SES students. But that is not to say that the few highly effective schools couldn't have a greater effect. The data are few and far between.

According to Rotberg, the basic problem is poverty, which testing comparisons supposedly reflect. This implies causation when all we have is correlational data. And the share of variance in test results accounted for by poverty (SES) is small (an R² of 17.9%). Why focus on this small factor when 82% of the variance is attributable to other factors? It's not like anyone has succeeded in raising student achievement by raising a student's SES.

Two final notes.

1. You can add the term "a lot" to the Downes Lexicon of Misrepresenting Statistics Through Language. Apparently, "a lot" means 22% -- the largest figure I could find for childhood poverty in the US (as long as you don't mind excluding non-cash benefits from "income"). Remember, "often" and "many" are 16% and "a lot" is 22%. We're going to have to make up some new words to represent truly high levels or frequencies, since most of the existing words are quickly being used up for low-incidence events. I want to recommend "super duper" for 80% and "shiiiit!" for 90%.

2. According to Table 4.3b, the R2 for SES effects in the U.S. is 23.8%. I calculated the R2 to be 31.2% for parental education levels in Pennsylvania relative to student performance. I'm still waiting for Stephen Downes to explain to me why Pennsylvania's scores don't generalize to the rest of the U.S.

(Hat tip to Charles Barone)


Tracy said...

I want to recommend "super duper" for 80% and "shiiiit!" for 90%.


Downes said...

If 22 percent of the population went downtown and protested, we'd say "a lot" of people protested.

In New York, that would be just under 2 million people. That's "a lot," isn't it?

Across the United States, it's almost 70 million people. That's "a lot," isn't it?

There are roughly 73 million children in the U.S. Twenty-two percent of 73 million is about 16 million children.

The tenor of this blog is such that the author can look at 16 million children living in poverty and say, "That's not a lot."

And somehow he thinks that he stakes out the moral high ground.

KDeRosa said...

Let's see if I get this right.

If 290 people came over to my house to visit, that would be "a lot" of people.

And, if 290 million people tried to cram their way into my house, well, that'd be "a lot" as well.

But, in a country like the US with 300 million people, 290 people represent nearly none of the population, while 290 million people represent nearly all of the population.

So, a lot = nearly none = nearly all.

Gee, that makes a lot of sense.

Let's say there's a million to one chance of my contracting a rare disease in any given year. In the U.S. about 300 people would contract this disease every year. That's a lot of people. Even the rarest of events yields a lot of people in countries like the US.

And of course, historically speaking, child poverty in the US is a lot lower than it was 50 years ago and a lot lower than it is in most of the rest of the world today, especially if you measure poverty on an absolute scale instead of a relative scale, and if you add in all the forms of "income" we currently exclude.

So, while we may excuse the layman for employing such sloppy terminology, the Washington Post journalist, the Professor of education policy, Stephen Downes, and I are supposed to hold ourselves to a higher standard when opining on statistical matters, unless, of course, we are trying to convey a false impression.