Previous: Correlation and causation
In this post, I’ll explain three of the most valuable tools for inference that arise naturally from causal modeling.
Screening off via causal intermediary
Screening off via common cause
Explaining away via common effect
Suppose that the rain causes the sidewalk to get wet, and the sidewalk getting wet causes you to slip and break your elbow.
This means that if you know that it’s raining, then you know that a broken elbow is more likely. But if you also know that the sidewalk is wet, then learning whether or not it is raining no longer makes a broken elbow more likely. After all, the rain is only a useful piece of information for predicting broken elbows insofar as it allows you to infer sidewalk-wetness.
In other words, the information about sidewalk-wetness screens off the information about whether or not it is raining with respect to broken elbows. In particular, sidewalk-wetness screens off rain because it is a causal intermediary to broken elbows.
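Here is a minimal sketch of this chain with made-up numbers, just to check the claim by brute force. The probabilities and the helper names (`bernoulli`, `p_broken`) are my own assumptions for illustration; nothing about them is special.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_RAIN = 0.3
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}     # P(sidewalk wet | rain)
P_BREAK_GIVEN_WET = {True: 0.05, False: 0.01}  # P(broken elbow | sidewalk wet)

def bernoulli(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1 - p

# Joint distribution of the chain: rain -> wet -> broken.
joint = {
    (rain, wet, broken): bernoulli(P_RAIN, rain)
    * bernoulli(P_WET_GIVEN_RAIN[rain], wet)
    * bernoulli(P_BREAK_GIVEN_WET[wet], broken)
    for rain, wet, broken in product([True, False], repeat=3)
}

def p_broken(rain=None, wet=None):
    """P(broken elbow | whatever we were told about rain and wetness)."""
    rows = [
        (outcome, p) for outcome, p in joint.items()
        if (rain is None or outcome[0] == rain)
        and (wet is None or outcome[1] == wet)
    ]
    return sum(p for outcome, p in rows if outcome[2]) / sum(p for _, p in rows)

# Rain alone is informative about broken elbows...
print(p_broken(rain=True), p_broken(rain=False))
# ...but once wetness is known, rain adds nothing.
print(p_broken(rain=True, wet=True), p_broken(rain=False, wet=True))
```

The first pair of numbers differs, while the second pair agrees up to floating-point rounding. That agreement is exactly what it means for wetness to screen off rain.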
Suppose that being wealthy causes you to eat more nutritious food, and being wealthy also causes you to own fancy cars.
This means that if you see somebody in a fancy car, you know it is more likely that they eat nutritious food. But if you already knew that they were wealthy, then knowing that their car is fancy tells you no more about the nutritiousness of their diet. After all, the fanciness of the car is only a useful piece of information for predicting nutritious diets insofar as it allows you to infer wealth.
In other words, wealth screens off ownership of fancy cars with respect to nutrition. In particular, wealth screens off ownership of fancy cars because it is a common cause of nutrition and fancy car owning.
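The same kind of brute-force check works for a common cause. Again, the numbers and the names (`weight`, `p_nutritious`) are assumptions I made up for the sketch, not data about real car owners.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_WEALTHY = 0.2
P_NUTRITIOUS = {True: 0.8, False: 0.4}   # P(nutritious diet | wealthy)
P_FANCY_CAR = {True: 0.6, False: 0.05}   # P(fancy car | wealthy)

def weight(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1 - p

# Joint distribution of the fork: nutritious <- wealthy -> fancy car.
joint = {}
for wealthy, nutritious, fancy in product([True, False], repeat=3):
    joint[(wealthy, nutritious, fancy)] = (
        weight(P_WEALTHY, wealthy)
        * weight(P_NUTRITIOUS[wealthy], nutritious)
        * weight(P_FANCY_CAR[wealthy], fancy)
    )

def p_nutritious(fancy, wealthy=None):
    """P(nutritious diet | car fanciness, plus wealth if it is known)."""
    rows = [
        (outcome, p) for outcome, p in joint.items()
        if outcome[2] == fancy and (wealthy is None or outcome[0] == wealthy)
    ]
    return sum(p for outcome, p in rows if outcome[1]) / sum(p for _, p in rows)

# Seeing a fancy car raises the probability of a nutritious diet...
print(p_nutritious(fancy=True), p_nutritious(fancy=False))
# ...unless wealth is already known, in which case the car is irrelevant.
print(p_nutritious(fancy=True, wealthy=True), p_nutritious(fancy=False, wealthy=True))
```

Structurally the only change from a causal chain is the direction of one arrow, yet the screening-off behavior is the same.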
Suppose that being really intelligent causes you to get on television, and being really attractive causes you to get on television, but attractiveness and intelligence are not directly causally related.
This means that in the general population, you don’t learn anything about somebody’s intelligence by assessing their attractiveness. But if you know that they are on television, then you do learn something about their intelligence by assessing their attractiveness.
In particular, if you know that somebody is on television, and then you learn that they are attractive, then it becomes less likely that they are intelligent than it was before you learned this.
We say that in this scenario attractiveness explains away intelligence, given the knowledge that they are on television.
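Explaining away is the least intuitive of the three, so it is worth checking numerically. This is a sketch under assumed numbers: the base rates and the table `P_ON_TV` are invented for illustration, and the helper `p_intelligent` is hypothetical.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_INTELLIGENT = 0.1
P_ATTRACTIVE = 0.1
P_ON_TV = {  # P(on television | intelligent, attractive)
    (True, True): 0.10,
    (True, False): 0.05,
    (False, True): 0.05,
    (False, False): 0.001,
}

def weight(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1 - p

# Joint distribution of the collider: intelligent -> on TV <- attractive.
joint = {
    (smart, pretty, tv): weight(P_INTELLIGENT, smart)
    * weight(P_ATTRACTIVE, pretty)
    * weight(P_ON_TV[(smart, pretty)], tv)
    for smart, pretty, tv in product([True, False], repeat=3)
}

def p_intelligent(attractive=None, on_tv=None):
    """P(intelligent | whatever we were told about looks and TV status)."""
    rows = [
        (outcome, p) for outcome, p in joint.items()
        if (attractive is None or outcome[1] == attractive)
        and (on_tv is None or outcome[2] == on_tv)
    ]
    return sum(p for outcome, p in rows if outcome[0]) / sum(p for _, p in rows)

# In the general population, attractiveness says nothing about intelligence.
print(p_intelligent(), p_intelligent(attractive=True))
# Among people on television, learning that someone is attractive
# makes intelligence less likely than it was before.
print(p_intelligent(on_tv=True), p_intelligent(on_tv=True, attractive=True))
```

With these particular numbers the drop is large, but the direction of the effect is the point, not its size.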
I want to introduce some notation that will allow us to really compactly describe these types of effects and visualize them clearly.
We’ll depict an ‘observed variable’ in a causal diagram as follows:
This diagram says that A causes B, B causes C, and the value of B is known.
In addition, we talked about the value of one variable telling you something about the value of another variable, given some information about other variables. For this we use the language of dependence.
To say, for example, that A and B are independent given C, we write:
(A ⫫ B) | C
And to say that A and B are dependent given C, we just write:
~(A ⫫ B) | C
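With this notation, we can summarize everything I said above with the following diagram:
Row 1 (causal intermediary): A → B → C, where ~(A ⫫ C) but (A ⫫ C) | B
Row 2 (common cause): A ← B → C, where ~(A ⫫ C) but (A ⫫ C) | B
Row 3 (common effect): A → B ← C, where (A ⫫ C) but ~(A ⫫ C) | B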
In words, the first row expresses dependent variables that become independent when conditioning on causal intermediaries. B screens off A from C as a causal intermediary.
The second expresses dependent variables that become independent when conditioning on common causes. B screens off A from C as a common cause.
And the third row expresses independent variables that become dependent when conditioning on common effects. A explains away C, given B.
Repeated application of these three rules allows you to determine dependencies in complicated causal diagrams. Let’s say that somebody gives you the following diagram:
(The arrows in this diagram are A → C, C → E, A → D, D → F, and B → D.)
First they ask you if E and F are going to be correlated.
We can answer this just by tracing causal paths through the diagram. If we look at all connected triples on paths leading from E to F and find that there is dependence between the end variables in each triple, then we know that E and F are dependent.
The path ECA is a causal chain, and C is not observed, so E and A are dependent along this path. Next, the path CAD is a common cause path, and the common cause (A) is not observed, so dependence is retained along this path as well. And finally, the path ADF is a causal chain with D unobserved, so A and F are dependent along the path.
So E and F are dependent.
Now your questioner tells you the value of D, and re-asks you if E and F are dependent.
Now dependence still exists along the paths ECA and CAD, but the path ADF breaks the dependence. This follows from the rule in row 1: D is observed, so A is screened off from F. Since A is screened off, E is as well. This means that E and F are now independent.
Suppose they asked you if E and B were dependent before telling you the value of D. In this case, the dependence travels along ECA, and along CAD, but is broken along ADB, because D is a common effect of A and B and has not been observed. This follows from our rule in row 3.
And if they asked you if E and B were dependent after telling you the value of D, then you would respond that they are dependent. Now the last leg of the path (ADB) is dependent, because A and B explain each other away.
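The path-tracing procedure is mechanical enough to sketch in code. The edge list below is what I take the diagram to be, read off from the walkthrough (A → C → E, A → D → F, B → D), so treat it as an assumption; the function names are likewise just illustrative. One refinement that did not come up above is included: a common effect also opens a path when one of its descendants is observed, which is the standard version of the row 3 rule.

```python
# Edges as (cause, effect) pairs. Assumed from the walkthrough in the text.
EDGES = [("A", "C"), ("C", "E"), ("A", "D"), ("D", "F"), ("B", "D")]

def children(node):
    return {effect for cause, effect in EDGES if cause == node}

def parents(node):
    return {cause for cause, effect in EDGES if effect == node}

def descendants(node):
    """Everything reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

def simple_paths(start, goal, visited=()):
    """Yield every non-repeating path from start to goal, ignoring arrow direction."""
    visited = visited + (start,)
    if start == goal:
        yield visited
        return
    for neighbor in sorted(children(start) | parents(start)):
        if neighbor not in visited:
            yield from simple_paths(neighbor, goal, visited)

def triple_is_open(left, mid, right, observed):
    """Apply the three rules to one connected triple on a path."""
    if left in parents(mid) and right in parents(mid):
        # Common effect (row 3): open only if mid or a descendant is observed.
        return mid in observed or bool(descendants(mid) & observed)
    # Causal chain or common cause (rows 1 and 2): open only if mid is unobserved.
    return mid not in observed

def dependent(x, y, observed=frozenset()):
    """True if some path between x and y has every triple open."""
    observed = set(observed)
    return any(
        all(triple_is_open(*path[i:i + 3], observed) for i in range(len(path) - 2))
        for path in simple_paths(x, y)
    )

print(dependent("E", "F"))         # D unobserved
print(dependent("E", "F", {"D"}))  # D observed
print(dependent("E", "B"))         # D unobserved
print(dependent("E", "B", {"D"}))  # D observed
```

Enumerating every path is fine for a diagram this small, but it scales badly; for large graphs you would want a proper d-separation algorithm instead of this brute-force version.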
The general ability to look at a complicated causal diagram and read off which variables are dependent is a valuable tool, and we will come back to it in the future.
Next, I’ll talk about one of my current favorite applications of causal diagrams: Simpson’s paradox!
Next: Simpson’s paradox