Simpson's Paradox - an introduction

Imagine you have a new treatment for a serious disease that you want to test. You find some male patients and give some of them the new treatment and the rest the old treatment:

                Lived   Died   Recovery %
New treatment      70     30      70%
Old treatment     180    120      60%

It looks like the new treatment increases the recovery rate by 10 percentage points.

So we test it on a group of women:

                Lived   Died   Recovery %
New treatment      90    210      30%
Old treatment      20     80      20%

Again, the new treatment appears to increase the recovery rate by 10 percentage points.

But what happens when we look at the total numbers for men and women combined? You can do the math yourself, but here it is:

                Lived   Died   Recovery %   Total
New treatment     160    240      40%        400
Old treatment     200    200      50%        400

Suddenly the new treatment appears to decrease the recovery rate by 10 percentage points instead of increasing it! Seems impossible, no?
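For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the recovery rates from the counts in the tables above. The dictionary layout and the function name recovery_rate are illustrative choices, not part of any library.

```python
# (lived, died) counts copied from the tables above
counts = {
    "men":   {"new": (70, 30),  "old": (180, 120)},
    "women": {"new": (90, 210), "old": (20, 80)},
}

def recovery_rate(lived, died):
    """Fraction of patients who lived."""
    return lived / (lived + died)

# Per-group rates
for group, treatments in counts.items():
    for name, (lived, died) in treatments.items():
        print(f"{group:5} {name}: {recovery_rate(lived, died):.0%}")

# Combined rates: add up the raw counts first, then divide
combined = {}
for name in ("new", "old"):
    lived = sum(counts[g][name][0] for g in counts)
    died = sum(counts[g][name][1] for g in counts)
    combined[name] = recovery_rate(lived, died)
    print(f"all   {name}: {combined[name]:.0%}")
```

Note that the combined rate is computed from the summed counts, not by averaging the two percentages.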

There is no math error above; the reversal is due to an effect called Simpson's Paradox.

Simpson's Paradox can occur when a confounding variable (here, sex) is distributed unevenly across the groups being compared. Women had a lower recovery rate than men under both treatments. We gave the old treatment to far more men (300) than women (100), yet we gave the new treatment to far more women (300) than men (100). The combined figure for the new treatment is therefore dominated by women, and the combined figure for the old treatment by men. This effect can, and often does, happen in real-world data. More info on the Wikipedia page.
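One way to see the reversal is to write each combined rate as a weighted average of the per-group rates, where each weight is the share of that treatment's patients who fall in the group. The short Python sketch below does this with the numbers from the tables; the names new, old and weighted_rate are illustrative.

```python
# (recovery rate, number of patients) per group, from the tables above
new = {"men": (0.70, 100), "women": (0.30, 300)}
old = {"men": (0.60, 300), "women": (0.20, 100)}

def weighted_rate(groups):
    """Combined rate = sum over groups of (group rate * group share of patients)."""
    total = sum(n for _, n in groups.values())
    return sum(rate * n / total for rate, n in groups.values())

print(f"new: {weighted_rate(new):.2f}")
print(f"old: {weighted_rate(old):.2f}")
```

If the patient counts were balanced instead, say a hypothetical 200 men and 200 women per treatment, the same function would give 0.50 for the new treatment and 0.40 for the old one, and the reversal would disappear.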