Friday, January 16, 2009

Correlation, Causation, and non-sense

I have seen this argument a lot lately. It normally runs the standard course of "correlation does not imply causation," with a strong implication that a correlation does not imply any relationship at all.

Unfortunately it isn't really true.

It is true that a correlation between two variables, say for example the weather and autism rates, does not imply that one causes the other. But it does not rule it out either.

So making a statement that implies a lack of causation is as wrong as saying that there is a causal relationship. A correlation alone simply does not give you enough information to say either way. Actually, for causation there first has to be a correlation, but I digress. In reality the best you can do is make a neutral statement.

However, a correlation does say that these two variables have a relationship where they change in tandem, i.e. they are not independent. This means that the variables are related in some way or there is some other factor to which they both respond.

So at that point you either a) have bad data, b) have a spurious correlation, or c) have a real relationship between the variables.

For a better explanation of this I suggest reading the following:

http://en.wikipedia.org/wiki/Correlation_does_not_imply_causation

So going back to the ice cream and murder example from the post that I linked to, the correlation between the two apparently unrelated items does show that they can both be related to some outside factor; in this case the weather can affect both. So while ice cream sales and murder don't cause one another, they do have a causal factor in common, the weather. Not that this implies that the weather is the only factor driving both; it simply implies that it is a factor.
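To make the common-factor idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the temperature distribution, the slopes, the noise levels, and the variable names are invented, not real sales or crime data. Both series are generated from temperature alone and never look at each other, yet they still come out correlated. Subtracting a fitted temperature line from each (a simple way of "holding the weather fixed") is used here to show what is left over.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
days = 5000

# Hypothetical daily temperatures (made-up mean and spread).
temperature = [rng.gauss(20, 8) for _ in range(days)]

# Each series depends on temperature plus its own independent noise.
# Neither series ever reads the other one. Units are arbitrary.
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
murders = [0.5 * t + rng.gauss(0, 4) for t in temperature]

raw = pearson(ice_cream, murders)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(murders, temperature))

print(f"ice cream vs murders, raw: {raw:.2f}")
print(f"ice cream vs murders, temperature removed: {adjusted:.2f}")
```

With these invented settings the raw correlation should come out clearly positive, while the temperature-adjusted one should sit close to zero. That is the pattern you would expect when a shared factor, rather than either variable, is doing the work. It says nothing about real ice cream or real murder rates.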

Or, while we are talking about the weather, if you have studies showing a correlation between autism rates and the weather, it does not follow that the weather causes autism or causes autism to be diagnosed. It does follow that there is some common factor between the two (assuming that there is not some problem with the data or the study). That implies there is something related to the environment that can affect autism. And for a disorder that is taken to be strictly genetic, that would be a bit of a pickle.

But I guess that is the real issue anyway. As a parent, if I see a therapy change the behavior of my child with autism, authors like the above would have me doubt that the therapy worked. Or if I change my child's diet and see drastic improvements in attention and eye contact, they would tell me that might have just happened anyway without the diet change.

After all, just because B follows A doesn't mean that there is a relationship. Taken to the extreme, just because I turn the door handle on a door and the door opens, it does not follow that me turning the handle opened the door, right?

Only it turns out that once you start down that road you can't really prove a relationship between any two events at all. I push the key on the keyboard and a letter appears on the screen of the computer - what evidence do I have that the two have any real correlation let alone a causal one?

It is a good thing that people have the ability to reason. After all, you might have identical twin girls who both get a shot on the same day, both develop a fever within the next 24 hours and spend the next month sick and mostly non-responsive. And after the end of the month you give them another shot because everyone knows they are perfectly safe and the doctor is saying that you should do it. And then you spend the next six months wondering if they can still hear when they answered to their names before but now don't respond to any sounds. Then the diagnosis.

Could it have happened anyway? Of course. Was there any relationship between the shots and the regression? There is no way to know for certain. Would it have happened regardless of the shots? Again, that isn't knowable. But given the fact that they are genetically identical yet separate people, and they both had the same reaction (illness) at the same time, it is less likely to be just chance and more likely that the shots were a potential factor in the equation, just like the hot weather helps to sell ice cream.

But I am guessing this will be just another one of those meaningless correlations for some people. Just because it happened to one, oops, I mean two, people the same way at the same time means nothing.
