
We often need fast answers with limited resources. We need to prove something for which we don’t have enough evidence or, as the case may be, enough data. Under such circumstances it’s easy to fall into the trap of interpreting correlation as causation. To understand the difference between causation and correlation, it helps to look at how completely unrelated data sets can show a spurious correlation.

Once you do that, you will see that two unrelated data sets, such as per capita cheese consumption and the number of people who died by becoming entangled in their bed sheets, can be correlated. So can the number of suicides by strangling and suffocation and US spending on science, space and technology. These examples make it obvious that neither factor can be the cause of the other. Hence, even though there is a visible correlation, causation is very unlikely.
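To see how this can happen, here is a minimal sketch in Python. It uses synthetic data, not the cheese or spending figures mentioned above: the names `series_a` and `series_b`, the slopes and the noise levels are all assumptions chosen for illustration. Two series are generated independently, but both drift upward over time, and that shared trend alone is enough to produce a strong correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
steps = range(200)

# Two synthetic series (illustrative assumption, not real data).
# Each drifts upward over time and has its own independent noise.
series_a = [0.5 * t + rng.gauss(0, 1) for t in steps]
series_b = [3.0 * t + rng.gauss(0, 5) for t in steps]

# Correlation of the raw levels: dominated by the shared upward trend.
r_levels = pearson(series_a, series_b)

# Correlation of step-to-step changes: the constant trend contributes
# nothing here, so only the independent noise is compared.
changes_a = [later - earlier for earlier, later in zip(series_a, series_a[1:])]
changes_b = [later - earlier for earlier, later in zip(series_b, series_b[1:])]
r_changes = pearson(changes_a, changes_b)

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

The levels should come out strongly correlated while the changes should not, which suggests that the apparent relationship comes from both series trending over time rather than from one influencing the other.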

How do we establish a cause-effect (causal) relationship?

Once we understand that correlation and causation are different, we can ask what criteria must be met to establish a causal relationship. Generally, three criteria must be satisfied before we can say that there is evidence for a causal relationship:

Temporal Precedence: First, you have to be able to show that your cause happened before your effect.

Covariation of the Cause and Effect: If you observe that whenever X is present, Y is also present, and whenever X is absent, Y is absent too, then you have demonstrated that there is a relationship between X and Y. One common way to measure this is with a correlation coefficient.

Rule Out Plausible Alternative Explanations: Just because you show there’s a relationship doesn’t mean it’s a causal one. It’s possible that some other variable or factor is causing the outcome. This is sometimes referred to as the “third variable” or “missing variable” problem, and it’s at the heart of the issue of internal validity.

So, in order to argue that there is internal validity — and that there’s a causal relationship — we have to “rule out” the plausible alternative explanations.
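The covariation and “third variable” criteria can be illustrated with a small Python sketch on synthetic data. Everything here is an assumption made for illustration: `z` is a hypothetical third variable that drives both `x` and `y`, and the coefficients are arbitrary. The sketch first measures covariation with the Pearson correlation coefficient, then checks one alternative explanation by computing the partial correlation of `x` and `y` with `z` held fixed. This is only one of several ways to probe for a third variable, and it cannot rule out variables you did not measure.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical third variable z drives both x and y.
# x has no direct effect on y, and y has none on x.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [2.0 * zi + rng.gauss(0, 1) for zi in z]
y = [3.0 * zi + rng.gauss(0, 1) for zi in z]

# Covariation: x and y move together.
r_xy = pearson(x, y)
r_xz = pearson(x, z)
r_yz = pearson(y, z)

# First-order partial correlation of x and y, controlling for z.
partial_xy_given_z = (r_xy - r_xz * r_yz) / (
    (1 - r_xz ** 2) * (1 - r_yz ** 2)
) ** 0.5

print(f"correlation of x and y:            {r_xy:.3f}")
print(f"partial correlation, z held fixed: {partial_xy_given_z:.3f}")
```

In this setup the plain correlation should be strong while the partial correlation should sit close to zero, which is the pattern you would expect when a third variable, rather than X itself, explains the relationship with Y.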