[ExI] simpson's paradox - bill w

William Flynn Wallace foozler83 at gmail.com
Sat Jan 25 15:46:24 UTC 2020

This one is new to me.

During a faculty meeting, a group of 9th grade teachers decided they needed
to further understand what the optimal duration of study is for students to
achieve satisfactory results. So, they decided to gather the approximate
number of hours students were studying, and then compare it to the students’
test scores.

Mr. Simpson convinced the faculty that more data means better results, and
so all of the teachers integrated their cross-course data for the analysis.

The results were astounding. To everyone’s confusion, the less a student
studied, the higher they tended to score on tests.

In fact, the coefficient associated with this correlation was -0.7981, a
strongly negative relationship.

Should they be encouraging their students to study less? How in the world
could data be backing up such a claim? Surely something was missing.

After discussing the results, the teachers agreed they should consult the
school’s statistician, Mrs. Paradox. After Mr. Simpson explained to Mrs.
Paradox what they had found, she suggested they analyze each course’s data
individually.

So, they went ahead and analyzed Phys. Ed. and proceeded to have their
minds blown.

A correlation of 0.6353! How in the statistical universe was this even
possible?

*Simpson's paradox: a statistical phenomenon where a seemingly strong
relationship reverses or disappears when a third, confounding variable is
introduced.*

She convinced Mr. Simpson to plot all of the data once again, but then
color-code each course separately to distinguish them from one another.

After doing so, Mr. Simpson and the 9th grade faculty concluded that the
relationship was indeed positive, and that the more hours a student
studied, the higher the grade tends to be.

Including the course of study in the analysis completely reversed the
relationship.
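
For anyone who wants to see the reversal happen, here is a minimal sketch in
Python. The numbers are made up for illustration and are not the teachers'
data, so they will not reproduce -0.7981 or 0.6353. Only Phys. Ed. is named in
the story; the other two course names are hypothetical, and `pearson` is a
small helper written here rather than a library function. The assumption built
into the data is that harder courses demand more study hours yet yield lower
scores, which is what drags the pooled correlation negative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative (hours studied, test score) pairs per course.
# Within each course, more hours go with a higher score.
courses = {
    "Phys. Ed.": [(1, 86), (2, 88), (3, 91), (4, 93)],
    "English (hypothetical)": [(4, 70), (5, 73), (6, 74), (7, 77)],
    "Math (hypothetical)": [(8, 52), (9, 55), (10, 57), (11, 60)],
}

# Correlation inside each course, analyzed separately.
per_course_r = {
    name: pearson([h for h, _ in rows], [s for _, s in rows])
    for name, rows in courses.items()
}

# Correlation after pooling every course into one data set.
all_rows = [row for rows in courses.values() for row in rows]
pooled_r = pearson([h for h, _ in all_rows], [s for _, s in all_rows])

for name, r in per_course_r.items():
    print(f"{name}: r = {r:+.4f}")
print(f"All courses pooled: r = {pooled_r:+.4f}")
```

Each per-course coefficient comes out positive while the pooled one comes out
negative. The course is the third variable: it shifts both the typical hours
and the typical score, and that between-course trend swamps the within-course
one when everything is lumped together.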
