April 19, 2008

Reasoning and Writing Cont'd

In the previous post, a commenter asked for an example of a later lesson from Reasoning and Writing. Here is one of the last lessons from Level E, which would normally be covered at the end of fourth grade.

The students have been taught a procedure for testing the hypothesis set forth in the passage and writing a two-paragraph essay analyzing the hypothesis. It's not surprising that you don't see anything like this coming out of ed schools.

April 18, 2008

Reasoning and Writing

One of the programs being used in Gering is the DI writing program Reasoning and Writing (RW). Here's an overview of the program.

Levels A and B get students ready for real writing. Level C concentrates on narrative writing. Level D focuses on expository writing. Levels D through F are no walk in the park, even for higher performing kids. In fact, I'll go so far as to say that most people are never taught and never learn the skills taught in the latter levels of RW.

Most people are simply unable to critically examine text or a presentation of information and construct a coherent argument based thereon. Reading blogs and the comments section makes this fact abundantly clear. And the demographic that engages in such activities is highly skewed toward the top of the cognitive curve.

What I like about RW is that it teaches grammar and logic in the context of writing. As students learn grammar they immediately incorporate what they learn into their writing assignments. Every lesson has a writing assignment.

For example, by level C, lesson 89, students are learning how to properly use pronouns in writing (Jack and Jill = they), how to punctuate a series of nouns (bat, ball and glove), and how to construct paragraphs with multiple people speaking (a paragraph per speaker). The writing assignment for this lesson is based on the following series of pictures.

The assignment is for the students to draft a multiple paragraph narrative based on the sequence of events that take place in pictures 1, 2, and 4, including what must have happened in missing picture 3.

The students start off by setting the scene by writing where the people were, what they were doing, and what was happening in the background in the first picture.

Ann, Maria, and Tony were sitting on a bench at a bus stop. A house was burning behind them and smoke poured out of the window. Tony said, "I smell smoke."

Next, the students write about what happened in the second picture.

They stood up and faced the building. Ann pointed to the window and said, "It's coming from over there."

Tony said, "There's a dog in that window."

Finally, the students write about what must have happened in the missing third picture and what happened in the last picture. The students have learned to identify the differences between the second and fourth pictures to determine what must have happened in the missing third picture. In this example some differences are the location of the children, what they were doing, and the presence of Mrs. Wilson.

They went over to the window. Ann climbed onto Tony's shoulders, reached into the window, and grabbed the dog. Ann held the dog in her arms. Mrs. Wilson arrived home holding a bag of groceries. She said, "You saved King."

As she ran to a telephone booth, Maria said, "I'll call the fire department."

After the students complete the writing exercise, the class discusses some examples from the students' writing. The teacher notes some areas that the students should have included in their writing and some common errors. The students fix up their writing and turn it in to the teacher for review. The teacher reviews the writing by the next lesson and goes over the previous assignment at the beginning of the next lesson.

By the end of level C (lesson 110), students will be able to construct simple multi-paragraph narratives based on a sequence of events like this example.

As you can hopefully see, this lesson has far more instructional value than the journaling exercises elementary school students typically engage in.

And, yes, I wrote those paragraphs all by myself. The question is will my second grader do a better job than me when he does this exercise tonight.

Update: Here's what the boy wrote for this lesson. I only corrected a few spelling errors.

Ann, Maria and Tony were sitting on a bench at a bus stop. There was a house on fire behind them. "I smell smoke," said Tony.

Maria, Ann, and Tony stood up and looked behind them and saw a house on fire. "It's coming from over there", said Ann.

"There's a dog in that window," said Tony.

They ran to the house and Ann got on Tony's shoulders. Then Ann grabbed King while old lady Wilson came with groceries saying, "You saved King."

Maria ran to the telephone booth saying, "I'll call the fire department."

April 17, 2008

More Gering Data

(Continued from previous post)

I know that in DI, they hit language instruction hard. Think of it as the "content" course you need before you get to the other content courses. Language is content.

Not surprisingly, Gering is posting better language results.

Take, for example, these scores from the Terra Nova Language test, which show Gering fifth graders who received 3 yrs of DI language arts instruction compared to Gering seventh graders who did not.

That's even better than the reading scores.

How about some scores from Gering second graders on the Gates-MacGinitie Word Knowledge test?

The 2007 cohort received three years of DI instruction (grades K-2), the 2006 cohort received two years (grades 1-2), the 2005 cohort received 1 year (grade 2). Either the 2007 cohort had a lot of smart kids compared to the other cohorts or they learned a lot more words.

There's also been a lot of talk about NCLB causing schools to neglect the non-reading and non-math classes, such as social studies and science. Certainly, in DI a strong focus is placed on English Language Arts and Reading, but Gering doesn't appear to be suffering in these areas.

These are Terra Nova Science and Social Studies for 3rd and 4th grades. Remember you need to be able to read the tests in order to correctly answer questions. That point seems lost on many edu-pundits.

Finally, we turn to writing scores. The following scores show the improvement that Gering Hispanic students made in comparison to other Hispanic students in Nebraska as each cohort received more DI instruction in writing.

Gering went from performing significantly below the state average to performing better than the state average.

April 16, 2008


Back in January I was confused over the Washington Post's claiming that the D.C. public schools only spent $8,322 per pupil. Big city public schools in the U.S. are some of the most lavishly funded schools in the world and most likely in the history of mankind, though you'd never know it by the dilapidated facilities and floundering student achievement.

Andrew Coulson of the Cato Blog has crunched the numbers for D.C. and come up with a per pupil spending number of $24,606. That's obscene.

There are quite a few people that still think that you can improve education outcomes by just throwing a few more dollars at the problem. I don't think these people realize just how much money they'll need to throw at the problem because $24,606 doesn't appear to be enough.

Today's chart courtesy of data from the U.S. Census should give you an idea just how much money we continue to throw at the problem year in and year out.

These are total expenditures in constant dollars (i.e., adjusted for inflation) for public schools. In other words, this is what we actually pay per student based on daily attendance numbers. Basically, the amount we spend on public education has doubled in real dollars since the early seventies.

Has any other product or service doubled in price since then?

Not gasoline.

Carnival of Education

The 167th Carnival of Education is up at the CEA Blog. Go check it out.

April 15, 2008

New Features

I just added a feed widget to the sidebar over on the right, just below the "About D-Ed Reckoning" widget at the top.

I also added my first feed link to bloglines, an excellent reader and the one I use. Here's a fully functioning copy. Go ahead click it, I dare you.

Subscribe with Bloglines

If you have a bloglines account, clicking on the button should add the highly coveted d-ed reckoning feed to your bloglines feeds.

Mind you, the feeds have existed for quite some time. Now, they will hopefully be easier to subscribe to.

I'll add more buttons as time permits.

Update: Done and Done.


I know that a lot of very knowledgeable people read this blog. And, just because I disagree with some readers/commentors doesn't mean that I don't think they aren't knowledgeable. (How's that for a triple negative?)

In any event, I'd like to know what you think are decent standardized tests for academic subjects such as reading and math. I'm especially interested in hearing your opinions on testing the various aspects of reading ability.

What tests are good for measuring the mechanics of reading, such as decoding ability, vocabulary knowledge and fluency? Are there any reliable tests? How about the end-product of reading instruction -- reading comprehension?

What about math? What's a good test for skills students should possess at the end of elementary school and/or to see if they are ready for algebra?

What about tests for history, geography, science?

Be sure to list the tests' weaknesses along with the advantages.

More Results Out of Gering

(This is the third post in the Gering Series.)

Thanks to the DI implementation and lots of hard work by the district, Gering Public Schools has managed to close some of the achievement gaps between Hispanic students and white students. In Gering, Hispanics represent a substantial minority of students, nearly 30%.

Here are Kindergarten scores from the DIBELS phoneme segment fluency test for Hispanic and white students before and after the DI implementation.

As you can see, the percentage of Hispanic students passing the test has increased by an amount sufficient to close the achievement gap with the white students, whose pass rate has also increased.

Here are second grade scores from the DIBELS oral fluency test for Hispanic and white students.

Again, note that white students have made significant gains compared to previous cohorts, but Hispanics have made even greater gains--gains sufficient to create a reverse achievement gap.

Bear in mind that closing the gap with respect to the number of students passing the benchmark is not the same as closing the achievement gap with respect to absolute achievement scores. It could be that the scores of white students are still higher than the scores of Hispanic students. I don't have data to report either way, at least not yet. But, keep in mind that NCLB is concerned with closing the gap between the percentage of students meeting benchmarks, not with absolute scores.

This is how NCLB is supposed to work. Schools are supposed to be improving instruction such that student achievement is improved for all groups with the effect that more students from lagging groups will pass the benchmark and close the gap. This is how it is working in Gering.


Cost of Public Education Rising Faster Than the Cost of Gasoline

Readers know that I like a good visual aid. So, it shouldn't be surprising that I like this graph from Carpe Diem:

The chart shows that the retail price of public education per pupil has risen faster than the retail price of a gallon of gasoline.

I like the chart because it puts the steep rise in gasoline prices in contrast with the even steeper rise in the price of public education. And, of course, I can choose to forgo gassing up my gas-guzzlin' SUV, but I'm paying the price of public education whether I want to or not.

PS: You can always spot an effective post by the presence and strength of a Stephen Downes' argument in the comment thread. Stephen seems to be compelled to answer every blog post he disagrees with, a Sisyphean task if ever there was one. Unfortunately, he doesn't always have a good/persuasive argument to respond with.

April 14, 2008

Some Results Out of Gering

This post follows up on my previous post on Gering Public Schools, which has implemented the whole-school Direct Instruction (DI) reform with the help of the National Institute for Direct Instruction (NIFDI), beginning in the 2004-2005 school year. The intervention has been in place for three years so far and is beginning to generate some useful data.

The above graph shows DIBELS Nonsense Word Fluency (NWF) and Oral Reading Fluency (ORF) test data for grades K-6 from the spring of 2004 (before DI) and in the spring of 2007 (after 3 years of DI). These DIBELS tests are good predictors of the risk of student reading failure in subsequent grades. The graph shows the percentage of students meeting the benchmark goals. Meeting the benchmark indicates a low risk of reading failure.

The students in K-2 have been in DI since they began school. The students in grades 3-5 have received three years of DI; for example, the fifth grade students received DI in grades 3-5, but did not receive DI in grades K-2. The students in grade 6 only received a single year of DI in sixth grade.

As you can see from the graph, all grades, despite many students not receiving DI in each grade, have made substantial gains and their risk of reading failure has been significantly reduced. The kindergarten class of 2007, for example, is performing above the top 1 percentile based on these scores.

Here is a graph of grades 1-5 showing the performance of economically disadvantaged students on the same tests.

Not surprisingly, in 2007, the low-SES students in grades 1-5 outperformed the low-SES students in 2004. What is surprising is that the low-SES students in 2007 also outperformed all 2004 students in each grade, not just the low-SES students. That's impressive. So much for socio-economic status being a determining factor of academic success. With effective instruction, the predictive value of socio-economic status is diminished.

In case you were wondering how this performance translates to reading performance, below is a graph comparing Terra Nova Reading scores for the fifth grade class of 2007 (which received only three years of DI instruction) with the performance of three cohorts of seventh graders who did not receive any DI instruction. Terra Nova is a nationally normed standardized test.

As you can see from the graph, the fifth graders who received three years of DI outperformed the seventh graders who did not.

Continued in third post.

April 10, 2008

Gering Public Schools: The School District to Watch

Keep your eyes on the Gering Public Schools district in Northwest Nebraska.

Gering, a smallish 2,000 student district with 30% ethnic minorities (mostly non-ELL Hispanics) and with 43% of students on free/reduced lunch, was an underperforming school district back in 2002 when new district administrator, Don Hague, decided to do something drastic (and smart) with the district's new Reading First grant. Hague didn't just adopt a new research-based basal reading program for Gering; he didn't even adopt a research-validated reading program; he went whole hog and adopted a district-wide research-validated whole-school reform. Like I said, a smart move, because most administrators would only have done the bare minimum needed, making the fewest changes possible, to give the appearance they're doing something to solve the problem. Real reform requires more serious effort.

Hague also wisely chose Direct Instruction (DI) as his whole-school reform and implemented the reform with the assistance of the National Institute for Direct Instruction (NIFDI). It's a wise choice because DI has a proven track record of success in grades K-3 along with some longitudinal data for grades 4 and 5. By adopting the whole-school version of DI, there is a strong likelihood that Gering students will acquire all the fundamental skills they need for learning content area subject matter starting in sixth grade.

After three years, Gering is already starting to see results.

Before implementing DI, there was a 23 point gap between Hispanic and white students in fluency benchmarks in second grade in Gering. Last year, not only was the gap closed, a greater percentage of Hispanics met the fluency benchmark than did white students -- a -2% gap.

The district, not content waiting for the elementary students receiving DI to reach junior high, adopted the remedial DI reading program for its junior high students to improve their chances of succeeding in high school. After one year of remediation, Terra Nova scores went from a 39% pass rate to a 55% pass rate. That's an effect size of about 0.4 standard deviation (σ). Let's put that in perspective.

A 0.4σ improvement is about 60% better than the gains made in the Project Star class-size reduction study. And, in order to get similar gains via improving teacher effectiveness, you'd have to replace teachers with an average effectiveness at the 50th percentile with super teachers performing at the 94th percentile.
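For anyone who wants to check the arithmetic, here is a minimal sketch of how a change in pass rate can be turned into an approximate effect size. It assumes test scores are normally distributed, the cut score stays fixed, and the spread of scores doesn't change; the function name is mine and purely illustrative, not anything Gering or NIFDI published.

```python
from statistics import NormalDist


def effect_size_from_pass_rates(before: float, after: float) -> float:
    """Approximate shift in the mean, in standard deviations, implied by a
    change in pass rate against a fixed cut score.

    Assumes normally distributed scores with an unchanged spread.
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.inv_cdf(after) - z.inv_cdf(before)


# Junior high Terra Nova pass rate: 39% before remediation, 55% after.
gering = effect_size_from_pass_rates(0.39, 0.55)
print(f"Implied effect size: {gering:.2f} sd")  # about 0.40

# Compare with the roughly 0.25 sd gain reported for Project Star.
print(f"Ratio to Project Star: {gering / 0.25:.1f}x")  # about 1.6x
```

Under those assumptions the 39% to 55% jump works out to roughly 0.4σ, which is where the "about 60% better than Project Star" comparison comes from.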

But, the best part about the intervention in Gering is that it will be producing a great deal of longitudinal data. Gering has four elementary schools. Each elementary school, and only those schools, feeds into a single junior high school. That junior high school, and only that junior high school, feeds into the sole high school. Hopefully, Gering will stick with the intervention for the next ten years or so, so the longitudinal data can be collected.

I've been in contact with Gering and NIFDI trying to get some additional data for the intervention and plan on reporting any results and answers I can get. In the meantime, take a look at the short documentary on Gering that was produced. Note in particular, the interviews with the teachers and the responses they give, especially with respect to the children's reactions to the intervention.

Continued in Second Post.

April 9, 2008

Carnival of Education

The 166th version of the Carnival of Education is up.

Lots of good posts to read. Go check it out.

April 7, 2008

Your Pet Reform is Suckier Than You Think

We are not happy with student performance in the U.S. Which is to say, we are not happy with our schools' ability to educate.

So back in 1965 we, as a nation, passed the Elementary and Secondary Education Act (ESEA) to improve the state of education. The ESEA distributed federal funding to schools and school districts with a high percentage of students from low-income families. These became known as Title I schools. The ESEA lacked a real accountability provision and so schools were not accountable for achieving any results with the federal funds. Not surprisingly, those results were not forthcoming.

In 2001 we decided to ameliorate that deficiency by reauthorizing ESEA to include an accountability provision. We renamed ESEA to No Child Left Behind (NCLB) and increased funding to cover the new accountability provisions (i.e., standards setting and yearly testing in grades 3-8 and 11 in math and reading (and now science)).

Let's put the problem in perspective.

This graph represents our baseline student performance back in 2001. Let's set our goal such that half the students met the goal back in 2001. (For those of you familiar with NAEP, this goal falls between the proficient and basic levels of performance). The blue shaded area under the curve represents the percentage of students who met the standard.

(Another way of looking at this is to pretend that all students took a standardized test back in 2001, we normed the test to get a normal distribution of performance with 50% of the students passing and 50% failing, then we froze that test. Subsequent improvement would result in more than 50% of students passing the test. If you think this is a Lake Wobegon effect, you don't understand the effect.)

The goal of NCLB was to have virtually all students meet this standard. This would have required an improvement of over 2 standard deviations (σ). And, thus, the chase began trying to find an intervention that would improve student performance, hopefully by at least 2σ. Seven years later this remains the national focus, at least among those who haven't given up yet.
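Here is a rough sketch of why "virtually all" translates to something over 2σ. It assumes the same idealized setup as the frozen-test thought experiment: normally distributed scores, a fixed cut score, and an improvement that shifts every student up by the same amount. The helper name is hypothetical.

```python
from statistics import NormalDist


def pass_rate_after_shift(baseline_pass_rate: float, shift_sd: float) -> float:
    """Pass rate after the whole distribution moves up by shift_sd standard
    deviations while the cut score stays frozen.

    Assumes normally distributed scores with an unchanged spread.
    """
    z = NormalDist()
    # Cut score in sd units, measured from the baseline mean.
    cut = z.inv_cdf(1 - baseline_pass_rate)
    return 1 - z.cdf(cut - shift_sd)


# Start from the 2001 baseline where half the students meet the standard.
for shift in (0.0, 1.0, 2.0):
    rate = pass_rate_after_shift(0.50, shift)
    print(f"{shift:.1f} sd improvement -> {rate:.1%} passing")
```

With a 50% baseline, a full 2σ improvement still leaves a bit more than 2% of students under the bar, so getting to "virtually all" takes somewhat more than 2σ.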

But, here's the problem. Most people don't seem to understand how much improvement is actually needed to comply with NCLB. How much is a 2σ improvement? A lot more than most people think.

In fact, some, perhaps many, think that a real 2σ improvement is impossible. Let's accept that premise for the time being and set our goal a bit lower. What would it take for a typical Title I classroom to perform as well as an average classroom? Here is the performance of a typical Title I classroom.

In the typical Title I classroom, only 20% (blue shaded area) of the students meet the standard. In order to have this Title I classroom perform as well as the typical classroom another 30% (yellow shaded area) of students would have to meet the standards. This represents an improvement of about 0.84σ, otherwise known as a large effect size.

What does this mean? It means that if we improved all schools across the board by about 0.84σ, then only about 20% of students nationwide wouldn't meet the standards. In other words, 80% of students would meet the standards. Now an 80% pass rate doesn't comply with NCLB; but, let me let you in on a little secret. States have found ways to cheat by lowering their standards and their cut scores such that if we loosened the NCLB requirements say to a 90% to 93% pass rate, we'd be within spitting distance of meeting NCLB requirements. But, the problem remains of how to squeeze out about 0.84σ of real school improvement in the first place.
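The 0.84σ and 80% figures can be reproduced with a few lines of code. This is only a sketch under the usual simplifying assumptions (normally distributed scores, fixed cut score, uniform shift), and the function names are made up for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def gap_in_sd(low_pass_rate: float, high_pass_rate: float) -> float:
    """Distance in standard deviations between two groups' means, inferred
    from their pass rates against the same cut score."""
    return z.inv_cdf(high_pass_rate) - z.inv_cdf(low_pass_rate)


def pass_rate_after_shift(baseline_pass_rate: float, shift_sd: float) -> float:
    """Pass rate once the distribution moves up by shift_sd against a
    frozen cut score."""
    cut = z.inv_cdf(1 - baseline_pass_rate)
    return 1 - z.cdf(cut - shift_sd)


# Typical Title I classroom (20% passing) versus average classroom (50%).
gap = gap_in_sd(0.20, 0.50)
print(f"Title I gap: {gap:.2f} sd")  # 0.84

# Apply that same improvement to the nation as a whole (50% baseline).
nationwide = pass_rate_after_shift(0.50, gap)
print(f"Nationwide pass rate after the shift: {nationwide:.0%}")  # 80%
```

The symmetry is not a coincidence: the shift that lifts a 20% classroom to 50% is the same shift that lifts a 50% population to 80%.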

There is, of course, no shortage of opinions as to how to improve schools. It seems that everyone has their pet reform that they think is going to be some sort of magic educational bullet. The problem is that most of these educational bullets are being shot out of pop guns.

For example, let's take the favored reform of most edu-commentators: class-size reduction. The theory is that by reducing class sizes down to ridiculously small (13 to 17 students per class) and ridiculously expensive levels, then student achievement will improve. In fact, student achievement will tend to improve, just not very much. Certainly less than these commentators think. In most reasonably rigorous experiments, such as Project Star, gains from class size reduction were found to be almost 0.25σ. Not much. Here's a graph to show you how little of an improvement that really is.

See that red sliver? That's the amount of improvement you can expect to see from class-size reduction. Not much. By reducing our typical Title I classroom down to Project Star levels we can expect to raise student achievement by a whopping 8 percentage points, from 20% to 28%. Break out the champagne, kids, it's time to celebrate!

We have a name for interventions that achieve effect sizes of less than 0.25σ -- not educationally significant. This reflects the realization that in the real world, such interventions will likely have little or no effect on student achievement. (For example, Project Star was plagued with many methodological flaws that would serve to inflate the already small effect size it achieved under experimental conditions.)

But let's not pick on class-size reduction reforms too much. The sad fact is that about 95% of all education reforms fail to achieve even the small effect sizes achieved in Project Star. This means that most education reforms fail to achieve educationally significant effects. Now go back and look at the graph again. See the red sliver which represents the smallest educationally significant effect size (0.25σ)? The red sliver for almost all educational reforms is even smaller than that red sliver shown on the graph. Wrap your head around that. And, make sure you keep that in mind the next time you tout your pet education reform. It sucks. Now you know it; stop pretending that you don't.

Now let's briefly leave the world of reality and enter the realm of fantasy: a fantasy world where statistical correlation is the same as causation. This is the land of Kozol. This is home to everyone who thinks that raising student socio-economic status (SES) will lead to student achievement. It's also home to those who think that improving teacher effectiveness is the be-all and end-all of education reform. It isn't. Statistical correlations aren't reality, no matter how much you want them to be.

Let's pretend for the sake of argument there's some magic potion that could increase teacher effectiveness by 2σ. To put this in perspective, this would raise the effectiveness rating of an average teacher (50%) to a super teacher (90% effectiveness rating) and would raise a 25% teacher to a 75% teacher. Using data from this study, you can see what kind of improvement we might expect from these new magical super teachers.

See the slightly larger red sliver? That sliver represents a 0.35σ effect size. An effect size that is educationally significant. But it's an effect size that still misses the goal by 19 percentage points, i.e., 41% of students failed to improve sufficiently in response to the super teachers. Achieving a 2σ increase in teacher effectiveness is a pipe dream. Even achieving a 1σ improvement is probably a pipe dream, especially when you consider that the study that looked into this question failed to find a correlation between any of the typical things (credentials, experience, etc.) thought to be associated with teacher effectiveness and increased student performance. With only a 1σ improvement, however, the effect size (about 0.26σ) becomes educationally insignificant.
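The size of those red slivers can be sanity-checked with a short script. This is a sketch, not the method behind the graphs: it assumes normally distributed scores, a frozen cut score, and that each intervention shifts the whole Title I classroom by its stated effect size. The names and the list of interventions are just labels for the effect sizes quoted in this post.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

TITLE_I_BASELINE = 0.20  # share of a typical Title I classroom meeting the standard
GOAL = 0.50              # pass rate of the average classroom


def pass_rate_after_shift(baseline_pass_rate: float, shift_sd: float) -> float:
    """Pass rate once the distribution moves up by shift_sd against a
    frozen cut score."""
    cut = z.inv_cdf(1 - baseline_pass_rate)
    return 1 - z.cdf(cut - shift_sd)


# Effect sizes quoted in the post, in standard deviations.
interventions = [
    ("class-size reduction", 0.25),
    ("2-sigma super teachers", 0.35),
]

for label, effect in interventions:
    rate = pass_rate_after_shift(TITLE_I_BASELINE, effect)
    shortfall = GOAL - rate
    print(f"{label}: {rate:.0%} passing, {shortfall:.0%} short of the goal")
```

Under these assumptions class-size reduction lands at about 28% passing and the super teachers at about 31%, which is the 19 point shortfall mentioned above.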

Bear in mind that many pet reforms relate to increasing teacher effectiveness. Paying teachers more is an attempt to increase teacher effectiveness. Raising teacher prestige is an attempt to increase teacher effectiveness. Requiring greater credentials is an attempt to increase teacher effectiveness.

Which finally brings us to the reason why improving student achievement by 0.84σ (a large effect size) is within the realm of possibility. That would be the little-heard-of Project Follow Through, the largest education experiment in U.S. education history, in which one intervention, the Direct Instruction (DI) intervention, actually achieved gains of at least 0.84σ, often more.

Notice the large red slice and the lack of a yellow slice indicating no shortfall. If only your pet education reform worked as well as this one.

Update: Teach Effectively has a related post and a link to an analysis of some of the few interventions, including effect sizes, that work. Go check them out.

Update II: Brett from the DeHavilland Blog has outed himself as a closet Vanilla Ice fan. I'm sure this was a difficult and painful decision for Brett and his family. The world needs more true heroes like Brett who aren't afraid to speak truth to power.

April 4, 2008

Reid Lyon Smacks Down Bob Slavin

Ed News is running a weeklong series of lengthy interviews (part one, part two, part three) with Reid Lyon, former chief of NICHD, on the Reading First "scandal." Lyon spends a great deal of time laying the smackdown on Bob Slavin, who initially instigated the investigation, and concludes:

To repeat myself, I have been disappointed that no respected education researcher, policy researcher, or Department of Education entity has fully dissected Slavin's allegations, identified all the evidence he used to support each allegation, and then examined the strength of that evidence in supporting the accusations. No doubt, this is tedious work but it must be done and preferably by a number of independent individuals and entities. As I am carrying out my own research on the veracity of the "evidence" I have been surprised at the amount of scandal mongering based, as best as I can identify, on back-fence gossip, and hearsay. I am hopeful that those who have generated the accusations against individuals and the Reading First program will step forward and provide objective evidence that the allegations are valid. I cannot find any evidence of illegal or unethical behavior in the massive amount of emails between the Reading First office and state and district Reading First officials. Nor can I find any evidence of this in the emails I am reviewing between the TACs and publishers/vendors, and Reading First state and district officials. To me, identifying the actual evidence of any wrongdoing is essential if we are going to improve a program beyond putting in place safeguards against the perception of conflicts of interest – don't get me wrong - these safeguards are critical. Evidence needs to be provided that identifies the instances when illegalities were actually committed. And when this evidence is presented, it needs to be reported under oath. In short, the record must be set straight.

I don't think this is necessary. The OIG reports were famously silent as to any of Slavin's accusations and no actual evidence was ever adduced which supported his accusations.

Finally, there appears to be an internet first, a comment by Zig Engelmann (assuming it's really him):

Lyon’s position about Chris Doherty is solid. Chris was butchered and broiled not because he did anything questionable, unethical, or self-serving but because he was a convenient target. Chris is a man of high character; he’s very smart; and he’s a doer, not a political goon. I found Lyon’s coverage both interesting and consistent with what facts I know about this neo-McCarthy witch hunt that served the hunters, but at a serious cost to Chris and the kids he was trying very hard to serve.

April 3, 2008

Slate on NCLB Contd

(continued from part I)

Now we get to part of the article in which Ryan tells us how to fix NCLB, otherwise known as pure fantasy. He has three main ideas for "fixing this mess," and by ideas I mean Ryan's opinion utterly lacking any empirical support.

1. National Standards.

It's time to create national standards and tests in at least reading, math, science, and social studies/history.
For some reason, national standards have become synonymous with "perfect standards." As if anything could come out perfect after going through the political process on the national level, where all the competing interest groups get a chance to influence the process in a way that benefits their particular way of educating. Now recall that most educators' "particular way of educating" is not only ineffective, but often detrimental to at-risk students. To put it mildly, there is absolutely no reason to believe that National Standards will be any better than the existing state standards, which are on average pretty awful.

2. Administer fewer tests

National tests should be given less often, perhaps in only fourth, eighth, and 11th grades.
Because it's always easier to identify and diagnose problems with less feedback, especially when you don't know what you're doing in the first place. It's a trial and error approach in which each trial lasts four years before we determine whether there's been an error.

3. Rank schools; don't prescribe punishments

Ranking systems aren't perfect, but using multiple criteria to rank schools should provide a much clearer and fuller picture of school quality. States can then decide on their own how they want to sanction or assist the low-performing

I don't see how ranking schools is any different or better than what we're doing now, unless there's some benefit to even less accountability--because that's exactly what everything proposed in this paragraph actually leads to.

And, before NCLB we used to let the states police themselves and they completely failed to do so. So good luck with that.

Then Ryan tells us:

If and when NCLB is fixed, the next president should concentrate on two key issues: teachers and preschool.

By concentrate, Ryan means "give more money to." And, we all know how well that's worked in the past. There is little correlation between teacher compensation and teacher performance. Most of the existing problems are unrelated to compensation. Increasing compensation won't help poor teaching practices and bad curricular decisions.

Universal preschool is the next magical panacea. But I can't for the life of me figure out how adding a year of preschool is going to help at-risk kids if the same clowns that run kindergarten run it. If they can't teach them in kindergarten, which they can't, what makes you think they'll be able to teach them in preschool when they are even younger and more difficult to teach?

NCLB is far from perfect, but there's no reason to think that Ryan's suggestions will improve anything.

April 2, 2008

Slate on NCLB

It's difficult to ascertain how much of this Slate article on fixing education policy is attributable to Slate's editors and how much to the nominal author, Jim Ryan. Certainly, the misleading title "fixing it: Repairing some of the worst Bush administration screw-ups." can be blamed on the Slate editors. The Bush administration education policy is basically NCLB, which is a bipartisan piece of legislation largely drafted by Democrats and left-of-center groups. So, classifying it as a "Bush screw-up" is a tad misleading. In fact, classifying it as a "screw-up" is also misleading. At worst, NCLB has perhaps been ineffective, but there is no evidence that it has made the already bad situation any worse.

Nonetheless, the body of the article is generally awful, and blame for that can be placed squarely on its author. Let's dissect.

Most of the problems caused by the act stem from its ridiculous test-and-punish regime.

Actually, it's more like a "reward with federal largess, test to make sure federal funds are being used effectively, and punish schools that are not using the funds effectively" regime. And, of course, it's a voluntary regime that states can opt out of if they so desire.

Specifically, the act promotes the heavy use and misuse not just of tests, but of stupid tests. This isn't a reason to abandon all testing; it is a reason, however, to come up with better tests and better ways to use those tests to judge schools.

The act does not promote the heavy use and misuse of stupid tests. Specifically, NCLB gives the states plenty of leeway to craft their own "challenging academic content standards" that must be "in academic subjects that--(I) specify what children are expected to know and be able to do; (II) contain coherent and rigorous content; and (III) encourage the teaching of advanced skills" (Section 1111(b)(1)(D)(i)) and "challenging student academic achievement standards" which must be "(I) [] aligned with the State’s academic content standards; (II) describe two levels of high achievement (proficient and advanced) that determine how well children are mastering the material in the State academic content standards; and (III) describe a third level of achievement (basic) to provide complete information about the progress of the lower-achieving children toward mastering the proficient and advanced levels of achievement" (Section 1111(b)(1)(D)(ii)). States must also enact an Accountability System (defined in 1111(b)(2)) that assures the state is making adequate yearly progress "toward enabling all public elementary school and secondary school students to meet the State’s student academic achievement standards, while working toward the goal of narrowing the achievement gaps in the State."

I'm not sure how anyone could read the statutory language as mandating the heavy use and misuse of stupid tests. But there we have it. Certainly, some states have crafted stupid standards and stupid tests, but NCLB didn't exactly mandate that they do any such thing. NCLB gives states plenty of leeway to fix their education problems in the best way they see fit or, as it turns out, avoid fixing their education problems by delaying the day of reckoning and/or setting low standards. But asking for more from NCLB would make it even more intrusive, which is something Ryan is definitely against already.

Current test results don't tell us all we need to know about schools. Far from it. Students are tested in reading and math and a little in science. Reading, math, and science are important, but so are social studies, history, literature, geography, art, and music. Instead of telling us how schools are doing in these other subjects, NCLB is turning them into endangered species by pushing schools—especially those that are struggling—to downplay if not ignore subjects not tested.

It is perhaps an exaggeration to analogize the situation to worrying about the leaky faucets while the house is on fire, but not by much. Fixing reading and math should be a top priority; you kinda need those two things in order to learn the other things we also think are important, such as history, geography, science, art, and music. In fact, there's good reason to believe that in order to fix reading comprehension problems you're going to have to go a long way toward fixing the content-area subjects anyway. Moreover, if you want to get more information on student abilities in these non-tested areas, you're probably going to have to increase the number of tests to include these subjects, something Ryan is against.

Many tests that are given further narrow the focus of education by relying on multiple-choice questions that reward memorization and regurgitation rather than analytical and creative thinking.

Any poorly designed test can be used to "reward memorization and regurgitation." However, a well-designed multiple-choice test will yield almost as good feedback as a well-designed fill-in-the-blanks or essay-style test. The trade-off is that multiple-choice tests are cheaper and easier to administer and easier to grade. And therefore the results can be returned to schools more quickly, so that schools might act on the feedback in a more timely manner.

The second problem is that looking at just a sheet of test scores is a lousy way to judge school quality. Standardized test results tend to track socioeconomic status. As a teacher once remarked, the most accurate prediction you can make based on a student's test score is her parents' income. Teachers and schools with middle-class kids will invariably look better than those with poor kids if the only measure is how many students in a particular year pass a test.

This is why NCLB mandates disaggregating the data for various at-risk subgroups, so that more affluent school districts can't hide their failures. It's not a perfect system, but testing subgroups does catch many schools. For example, in Pennsylvania 446 out of 501 school districts have enough economically disadvantaged students to trigger the reporting requirements of NCLB. Similarly, 128 districts have a sufficient number of black students to trigger the requirement even though over two-thirds of black students are concentrated in just 10 school districts.

And, not to put too fine a point on it, but this unnamed teacher is wrong. Parental education levels and student IQ levels are better predictors than SES.

What we can't tell from scores alone, because they don't tell us where students started or how much they progressed over the year, is the value that a particular teacher or school has added to a student's education.

This only really matters if the schools are doing a lousy job getting kids up to speed in the first place. Schools have nearly four years to get kids to read and do basic math at a third-grade level, which is about twice the amount of time it should take, even for low-performers, if the school is performing with reasonable competence. After that, the school still has to make a year's worth of gains every school year or else students will fall behind. In other words, these "value-added" proposals only really matter at all today because many schools continue to do a lousy job in K-3 before the first NCLB test is even administered.

The third and most fundamental problem has to do with perverse incentives. Schools must show annual improvements on test scores or face increasingly severe sanctions and the stigma of being labeled as failing. NCLB couples this punitive scheme with utter laxity regarding the standards and tests themselves. States get to develop their own standards, create their own tests, and set their own passing rates. Imagine if the EPA told the auto industry it would be fined heavily for polluting too much but let automakers decide for themselves what counts as "too much" pollution. That's basically how NCLB works.

Again, I note the irony that Ryan started the article by complaining that NCLB is too intrusive, yet all of his complaints imply a solution that is even more intrusive. I also note the irony that NCLB allows states to set their own standards and testing instruments (as discussed above) and Ryan is complaining that it's a bad thing that schools can't even meet these bogus standards. That should indicate how severe the underlying problem (schools don't know how to teach at-risk kids) really is. Imagine how punitive the measures would be if we held them to real standards. It's like blaming a scale that underweighs you by ten pounds for your being overweight by fifty pounds.

NCLB, despite lofty rhetoric to the contrary, is not about equalizing opportunities in poor and rich, city and suburban schools; it's about making sure kids can learn some of the basics. No less, for sure, but also no more.

You have to start somewhere, right? It makes little sense to test a student on his ability to interpret a Shakespeare sonnet if that student can't read the thing in the first place.

OK, this post is getting too long. I'll save Ryan's equally-dopey proposed solutions for another post.

April 1, 2008

Back from Hiatus

Took a vacation in Vermont to do a little skiing. Spring break came early this year in our district so we took advantage of it.

Looks like I didn't miss too much. That's the one good thing about education: nothing much happens that's important. Which is not to say that nothing happens--just that most of what happens isn't important.

The only thing of consequence I could find is that E.D. Hirsch turned 80 last week. Hirsch is apparently still going strong, as is evident from this American Educator article on state standards.

Some day we might put the content back in education. And one day we may learn how to get that content into the heads of at-risk students. But today is not yet that day.

Instead we get the umpteenth article on class size reduction and other educational panaceas.