Correlation and Covariance Are Not Proof of a Causal Relationship

Causal Relationships with Bayesian Networks

Whenever I clap twice, A.C. Milan scores a goal. There must be some correlation!

Alex Aman
Published in DataDrivenInvestor
4 min read · Dec 10, 2021



We all know the classic saying that “correlation does not imply causation”. In data science and machine learning, it is very important to find the variables in big data that actually cause changes in our target variable. For example, we know that sales can be driven by price, weather, discounts, geographic location, and so on, but not by the lunar cycle (even if the data shows a strong correlation). So there has to be a better way to find causal relationships, and for that we can use Bayesian networks.

Before going into Bayesian networks, one has to be clear about the concept of conditional independence. Here is a quick example:

A kid's height and a kid's knowledge grow at the same time, but that does not mean there is a relationship between height and knowledge. There is a third variable, age: as a kid's age increases, height increases, and as age increases, knowledge increases too. So the relationships are Age-Height and Age-Knowledge, not Height-Knowledge. Once we know the kid's age, height tells us nothing more about knowledge. That is what it means for height and knowledge to be conditionally independent given age.

So in Bayesian networks, we create one-way chains describing which variable causes which. In the example above, Age is a parent node of both Height and Knowledge, so Height and Knowledge each depend on Age, not on each other.
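The Age, Height, and Knowledge example can be checked with a small simulation. This is only a sketch under stated assumptions: the data is invented, the coefficients and noise levels are arbitrary, and `pearson` and `partial_corr` are hypothetical helper names written for this illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented toy data: Age drives both Height and Knowledge; neither drives the other.
age = [rng.uniform(5, 15) for _ in range(5000)]
height = [100 + 6 * a + rng.gauss(0, 5) for a in age]    # made-up coefficients
knowledge = [10 * a + rng.gauss(0, 10) for a in age]     # made-up coefficients

raw = pearson(height, knowledge)
controlled = partial_corr(height, knowledge, age)
print(f"corr(Height, Knowledge)       = {raw:.2f}")
print(f"corr(Height, Knowledge | Age) = {controlled:.2f}")
```

In this toy setup the raw correlation comes out strong, while the correlation after controlling for Age is close to zero, which is the pattern the example describes.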

Bayesian networks calculate probabilities by finding patterns (one-way chains) in the data and calculating how much a variable changes if we change its parent variable (or its parent's parent).
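To make this concrete, here is a minimal sketch of how a network with Age as the parent of Height and Knowledge could turn one-way chains into probabilities. Every number in the tables is invented for illustration, and the variable levels ("young", "tall", and so on) are assumptions, not something taken from real data.

```python
from itertools import product

# Toy conditional probability tables; every number is invented for illustration.
p_age = {"young": 0.5, "old": 0.5}
p_height_given_age = {"young": {"short": 0.8, "tall": 0.2},
                      "old":   {"short": 0.3, "tall": 0.7}}
p_knowledge_given_age = {"young": {"low": 0.9, "high": 0.1},
                         "old":   {"low": 0.4, "high": 0.6}}

def joint(age, height, knowledge):
    """P(Age, Height, Knowledge) = P(Age) * P(Height | Age) * P(Knowledge | Age)."""
    return (p_age[age]
            * p_height_given_age[age][height]
            * p_knowledge_given_age[age][knowledge])

# The factorised joint distribution must sum to 1 over all combinations.
total = sum(joint(a, h, k)
            for a, h, k in product(p_age, ["short", "tall"], ["low", "high"]))

# Changing the parent changes the child: shift in P(Knowledge = high) from young to old.
shift = p_knowledge_given_age["old"]["high"] - p_knowledge_given_age["young"]["high"]

def p_high_given_height(height):
    """P(Knowledge = high | Height), with Age summed out."""
    num = sum(joint(a, height, "high") for a in p_age)
    den = sum(joint(a, height, k) for a in p_age for k in ["low", "high"])
    return num / den

print(f"joint sums to {total:.2f}")
print(f"P(high | old) - P(high | young) = {shift:.2f}")
print(f"P(high | tall) = {p_high_given_height('tall'):.3f}")
print(f"P(high | short) = {p_high_given_height('short'):.3f}")
```

The last two lines show why a naive look at the data is misleading: when Age is ignored, Height still appears to predict Knowledge, even though the model contains no arrow between them.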

These one-way chains form a Directed Acyclic Graph (DAG) like the one shown below. DAGs follow two principles that help with the “correlation is not causation” problem:

- Each variable can affect its descendants, but a descendant cannot affect its parents
- There cannot be a loop in the network
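The "no loop" rule can be checked mechanically. The sketch below uses Kahn's algorithm, a standard topological-sort technique; `is_acyclic` is a hypothetical helper name, and this is not how bnlearn itself is implemented.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph given as (parent, child) pairs has no loop.

    Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    If every node can be removed this way, the graph is a DAG.
    """
    nodes = {n for edge in edges for n in edge}
    children = {n: [] for n in nodes}
    n_parents = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        n_parents[child] += 1

    queue = deque(n for n in nodes if n_parents[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            n_parents[child] -= 1
            if n_parents[child] == 0:
                queue.append(child)
    return removed == len(nodes)

dag = [("Age", "Height"), ("Age", "Knowledge")]
looped = dag + [("Knowledge", "Age")]  # Knowledge "causing" Age closes a loop

print(is_acyclic(dag))     # True
print(is_acyclic(looped))  # False
```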

Directed Acyclic Graphs (image by author)

The image above is a Directed Acyclic Graph (one-way chains) created with the R package bnlearn (research notes), which performs structure learning to find patterns in the data and build this DAG.

Focus on MD (the discount variable), CR (conversion rate), and Budget in the image for a causal relationship.
You can read the graph by following the chains, and you will find:

1. MD affects CR both through MD → PPC → CR and directly through MD → CR
2. Other factors also affect CR through PPC, namely Budget → PPC → CR
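Reading the chains by eye works for a small graph, but the same thing can be done in code. The sketch below lists every directed path into CR using only the arcs named in the two points above; the full graph in the image contains more arcs, and `directed_paths` is a hypothetical helper name.

```python
def directed_paths(edges, source, target):
    """List every directed path from source to target in a DAG."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for child in children.get(node, []):
            walk(child, path + [child])

    walk(source, [source])
    return paths

# Only the arcs mentioned in the text; the image shows additional ones.
edges = [("MD", "PPC"), ("PPC", "CR"), ("MD", "CR"), ("Budget", "PPC")]

for start in ("MD", "Budget"):
    for path in directed_paths(edges, start, "CR"):
        print(" -> ".join(path))
```

This prints the two MD chains and the single Budget chain listed above.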

In bnlearn, we can also inject our own domain knowledge into the DAG by whitelisting arcs that we know must be present or blacklisting arcs that we think cannot exist according to the business. You can find examples in the notebook in my GitHub repository.
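The idea behind whitelists and blacklists can be sketched in a few lines of plain Python. This is not bnlearn's API: `allowed_arcs` is a hypothetical helper, and the specific whitelisted and blacklisted arcs are assumptions chosen for illustration. It only shows how such constraints shrink the set of arcs a structure-learning search is allowed to try.

```python
from itertools import permutations

def allowed_arcs(variables, whitelist=(), blacklist=()):
    """Split candidate arcs into forced ones and ones the search may still try.

    Hypothetical helper, not part of bnlearn: whitelisted arcs are always kept,
    blacklisted arcs are never considered, and the reverse of a whitelisted arc
    is dropped so the forced direction cannot be flipped.
    """
    forced = set(whitelist)
    banned = set(blacklist) | {(child, parent) for parent, child in forced}
    optional = {arc for arc in permutations(variables, 2)
                if arc not in forced and arc not in banned}
    return forced, optional

variables = ["MD", "PPC", "CR", "Budget"]
forced, optional = allowed_arcs(
    variables,
    whitelist=[("MD", "CR")],                     # assumed business knowledge
    blacklist=[("CR", "Budget"), ("CR", "PPC")],  # assumed business knowledge
)
print("forced:", sorted(forced))
print(len(optional), "of", len(variables) * (len(variables) - 1),
      "possible arcs are left for the search")
```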

Thus a Bayesian network builds conditional probabilities on the basis of these chains: it looks at all the parents of a node and estimates, given that a certain variable has changed, how much another variable will change based on the relationships found in the DAG.
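For a single parent and child, a conditional probability of this kind is just a ratio of counts. The sketch below estimates P(CR | MD) from a handful of rows. The rows are invented for illustration and are not the data behind this article, `conditional_table` is a hypothetical helper name, and in the real DAG CR has more than one parent, so the actual tables are larger.

```python
from collections import Counter

# Toy rows invented for illustration: (MD level, CR level).
rows = [
    ("low", "low"), ("low", "low"), ("low", "low"), ("low", "high"), ("low", "high"),
    ("high", "low"), ("high", "high"), ("high", "high"), ("high", "high"), ("high", "high"),
]

def conditional_table(rows):
    """Estimate P(child | parent) by counting how often each pair occurs."""
    pair_counts = Counter(rows)
    parent_counts = Counter(parent for parent, _ in rows)
    return {(parent, child): count / parent_counts[parent]
            for (parent, child), count in pair_counts.items()}

cpt = conditional_table(rows)
print("P(CR = low | MD = low)  =", cpt[("low", "low")])
print("P(CR = low | MD = high) =", cpt[("high", "low")])
```

Comparing the two printed values shows how much the probability of a low CR moves when MD moves from high to low in this toy data.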

You can find the complete code for all of these steps in Python and R at the bottom of this article.

Here is an example of a relationship between variables: how reducing MD affects CR, calculated from conditional probabilities based on the DAG.

Relationship between Reducing MD and CR with Bayesian Networks

The code to create this graph is at the bottom of this article. We can read the graph like this:

When MD is reduced, the probability of CR being low is nearly 60%. In other words, if MD (the discount variable) on a product is low, the chance of a low CR (conversion rate) for that product is about 60%. DAGs assume conditional independence, so variables that are not in the same chain do not leak into each other's estimates. That is how Bayesian networks address the “correlation is not causation” problem.

Now you have a basic foundation for using Bayesian networks in your business. If you are new to Bayesian nets, you may want to go through a few topics for a deeper understanding of the mathematical and statistical side. You can start with the following:
1. Conditional Probability
2. Hill Climbing Algorithm
3. Markov Blanket Size
4. Bootstrapping
5. Arc Strength

Here is the complete code to perform the Bayesian network analysis:
