The Regression Effect

Excerpts from "SticiGui: Statistics Tools for Internet and Classroom Instruction with a Graphical User Interface," by Philip B. Stark, UC Berkeley Statistics. (SticiGui site)

Consider the IQs of a large group of married couples. Essentially by definition, the average IQ score is 100. The SD of IQ is about 15 points. Let us suppose that for this group, the correlation between the IQs of spouses is 0.7: "smarter" women tend to marry "smarter" men, and vice versa.

Consider a married woman whose IQ is 150 (genius level). What is your best guess of her husband's IQ?

We'll predict his IQ by regression: her IQ is 150, which is 50 points above average. 50 points is

(3 1/3) × 15 points = 3 1/3 SD,

so we would estimate the husband's IQ to be r × 3 1/3 SD = 0.7 × 3 1/3 SD above average, or about 2 1/3 SD above average. 2 1/3 SD is 35 points, so we expect the husband's IQ to be about 135, not nearly as "smart" as she is.

Now let's predict the IQ of the wife of a man whose IQ is 135. His IQ is 2 1/3 SD above average, so we expect her IQ to be 0.7 × 2 1/3 SD above average. That's about 1.63 SD, or 1.63 × 15 = 24½ points, above average, or 124½, not as "smart" as he is. How can this be consistent?
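The arithmetic above can be sketched as a small function. The mean (100), SD (15), and correlation (0.7) are the figures from the text; the function name is my own.

```python
# Regression method: predicted value = mean + r * (SDs from mean) * SD.
# Move only a fraction r of the way from the mean toward the given score.

def regression_predict(x, mean=100.0, sd=15.0, r=0.7):
    """Predict one spouse's IQ from the other's by the regression method."""
    sds_from_mean = (x - mean) / sd       # how unusual the given score is
    return mean + r * sds_from_mean * sd  # regress partway back to the mean

print(round(regression_predict(150), 1))  # husband of a woman with IQ 150: 135.0
print(round(regression_predict(135), 1))  # wife of a man with IQ 135: 124.5
```

Note that applying the prediction twice does not return the starting score: each prediction pulls toward the mean by the factor r, which is exactly the puzzle the next paragraph resolves.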

First of all, I assure you that the algebra is correct. The phenomenon you have just seen is quite general. It's called the regression effect. It is caused by the same thing that makes the slope of the regression line smaller in magnitude than the slope of the SD line: for a football-shaped scatterplot with positive r, in a vertical slice containing above-average values of X, most of the Y coordinates will be below the SD line. In a vertical slice containing below-average values of X, most of the Y coordinates will be above the SD line. For a football-shaped scatterplot with negative r, in a vertical slice for above-average values of X, most of the Y coordinates will be above the SD line, and in a vertical slice for below-average values of X, most of the Y coordinates will be below the SD line.
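The claim about vertical slices can be checked by simulation. Here is a minimal sketch in standardized units (mean 0 and SD 1 for both variables, so the SD line is y = x); the sample size, seed, and slice boundaries are arbitrary choices of mine.

```python
import random

random.seed(0)
r = 0.7          # positive correlation, as in the text
n = 200_000

below = total = 0
for _ in range(n):
    x = random.gauss(0, 1)                               # standardized X
    y = r * x + (1 - r**2) ** 0.5 * random.gauss(0, 1)   # Y correlated with X at r
    if 1.0 <= x <= 1.5:       # a vertical slice of above-average X values
        total += 1
        if y < x:             # below the SD line y = x
            below += 1

print(f"fraction of the slice below the SD line: {below / total:.2f}")
```

For r = 0.7 the fraction comes out well above one half, illustrating why the regression line is shallower than the SD line.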

In most test-retest situations, individuals who are much higher than average on one test tend to be above average, but closer to average, on the other test. (In the example above, "individuals" are couples, the first test is the IQ of one spouse, and the second test is the IQ of the other.) Similarly, individuals who are much lower than average in one variable tend to be closer to average in the other (but still below average). Those who perform best usually do so with a combination of skill (which will still be present in the re-test) and exceptional luck (which will likely not be so good in a re-test). Similarly, those who perform worst usually do so with a combination of lack of skill (which will still be present in a re-test) and bad luck (which is likely to be better in a re-test).

If the scatterplot is football-shaped, many more individuals are near the mean than "in the tails." A particularly high score could have come from someone with an even higher "true" ability, but who had bad luck, or someone with a lower "true" ability who had good luck. Because more individuals are near average, the second case is more likely; when the second case occurs, on a retest, the individual's luck is just as likely to be bad as good, so the individual's second score will tend to be lower. The same argument applies, mutatis mutandis, to the case of a particularly low score on the first test.
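The skill-plus-luck story can be simulated directly. This is a sketch under assumptions of my own: skill and luck are independent and normally distributed, luck is drawn afresh on each test, and the SDs are arbitrary.

```python
import random

random.seed(1)
n = 100_000
# Each score = true skill plus independent luck on that sitting.
skill = [random.gauss(100, 10) for _ in range(n)]
test1 = [s + random.gauss(0, 10) for s in skill]
test2 = [s + random.gauss(0, 10) for s in skill]

# Look at the individuals in the top 5% on the first test.
top = sorted(zip(test1, test2), reverse=True)[: n // 20]
avg1 = sum(t1 for t1, _ in top) / len(top)
avg2 = sum(t2 for _, t2 in top) / len(top)
print(f"top group, first test: {avg1:.1f}; same group, retest: {avg2:.1f}")
```

The top group's retest average is still above 100 (their skill is real) but noticeably lower than their first-test average (their luck was not): the regression effect, with no reward or punishment in sight.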

Failing to account for the regression effect and concluding instead that something else must cause the difference in scores is called the regression fallacy. The regression fallacy sometimes leads to amusing mental gymnastics and speculation, but can also be pernicious.

Example: Pilot training in the Israeli Air Force. (From Tversky and Kahneman, 1974. Judgment under Uncertainty: Heuristics and Biases, Science, 185, pp. 1124–1131.) A study was done in the Israeli Air Force on the effectiveness of punishment and reward in flight training. Some students were praised after particularly good landings, and others were reprimanded after particularly bad ones. It was observed that those who were praised usually did worse on their next landing, while those who were reprimanded usually did better on their next landing. The obvious conclusion is that reward hurts, and punishment helps. How might this be an instance of the regression fallacy?

Answer: After a particularly bad landing, one would expect the next to be closer to average, whether or not the student is reprimanded. Similarly, after a particularly good landing, one would expect the next to be closer to average, whether or not the student is praised.