Causal Programming

Causality is central to scientific inquiry. There is broad agreement on the meaning of causal statements, such as “Smoking causes cancer” or “Applying pesticides affects crop yields”. However, formalizing the intuition underlying such statements and conducting rigorous inference is difficult in practice. Accordingly, the overall goal of this dissertation is to reduce the difficulty of, and ambiguity in, causal modeling and inference. In other words, the goal is to make it easy for researchers to state precise causal assumptions, understand what they represent, understand why they are necessary, and reach precise causal conclusions with minimal difficulty.

Using the framework of structural causal models, I introduce a causation coefficient as an analogue of the correlation coefficient, analyze its properties, and create a taxonomy of correlation/causation relationships. Analyzing these relationships provides insight into why correlation and causation are often conflated in practice, as well as a principled argument as to why formal causal analysis is necessary. Next, I introduce a theory of causal programming that unifies a large number of previously separate problems in causal modeling and inference. I describe the use and implementation of a causal programming language as an embedded, domain-specific language called ‘Whittemore’. Whittemore permits rigorously identifying and estimating interventional queries without requiring the user to understand the details of the underlying inference algorithms. Finally, I analyze the computational complexity of determining the equilibrium distribution of cyclic causal models. I show this is uncomputable in the general case, under mild assumptions about the distributions of the model’s variables, suggesting that the focus of the structural causal model framework on acyclic causal models is a ‘natural’ limitation. Further extensions of the concept will have to either give up completeness or require the user to make additional (likely parametric) model assumptions.

Together, this work supports the thesis that rigorous causal modeling and inference can be effectively abstracted over, giving a researcher access to all of the relevant details of causal modeling while encapsulating and automating the irrelevant details of inference.