Summary
There is a growing interest in the intersection of causal inference and machine learning. On one hand, ML methods (e.g., prediction methods, unsupervised methods, representation learning) can be adapted to estimate causal relationships between variables. On the other hand, the language of causality could lead to new learning criteria that yield more robust and fair ML algorithms. In this course, we'll begin with an introduction to the theory behind causal inference. Next, we'll cover work on causal estimation with neural networks, representation learning for causal inference, and flexible sensitivity analysis. We'll conclude with work that draws upon causality to make machine learning methods fair or robust. This is an advanced course, and students are expected to have a strong background in ML.
COVID-19 Related Updates
- Classes to return to in-person mode starting Jan. 31.
- From UdeM: Classes to be fully online until Jan. 31.
- Classes to begin the week of Jan. 10.
Class Information

Start date: Jan. 11

When: Tuesdays, 12:30 to 2:30 PM and Fridays, 11:30 to 1:30 PM

Where:

In-person: Auditorium 2, 2nd floor, Mila building, 6650 Rue Saint-Urbain.

On Zoom: link on Studium. Please attend in person if possible.

Office hours: Tuesdays, 2:30 to 3:30 PM, Room F.04 in the Mila building.
Assigned Readings (updated often)
* = under resources on Piazza

Jan 14: Chapters 18, 19 and 20, Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi

Jan 18: Chapter 21 (on estimation), Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi

Jan 21: Chapter 1 (on potential outcomes), Causal Inference for Statistics, Social, and Biomedical Sciences*, by Guido Imbens and Donald Rubin

Jan 25: Identification of Causal Effects Using Instrumental Variables*, by Joshua Angrist, Guido Imbens and Donald Rubin

Jan 28: Chapter 4 (on counterfactuals), Causal Inference in Statistics: A Primer,* by Judea Pearl, Madelyn Glymour and Nicholas Jewell
- Adapting Text Embeddings for Causal Inference
- Adapting Neural Networks for the Estimation of Treatment Effects
- Valid Causal Inference with (Some) Invalid Instruments
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding
- A Simulation-Based Test of Identifiability for Bayesian Causal Inference
- Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction
- Nonlinear causal discovery with additive noise models
- Causal Autoregressive Flows
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning
- Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA
- Invariant Causal Prediction for Nonlinear Models
- Invariant Risk Minimization
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
- Nonlinear Invariant Risk Minimization: A Causal Approach
- Conditional variance penalties and domain shift robustness
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests
- Counterfactual fairness
- Counterfactual Risk Assessments, Evaluation, and Fairness
- Fairness in Risk Assessment Instruments: Post-Processing to Achieve Counterfactual Equalized Odds
- Robustly Disentangled Causal Mechanisms
- Towards Causal Representation Learning
Topics covered
- Introduction to causality
  - Causal graphical models
  - Defining causal quantities: interventions and counterfactuals
  - Identifying causal quantities: graphical criteria and instrumental variables
  - Estimating causal quantities
- ML helps causality
  - Adapting neural networks for estimation
  - Learning representations for causal inference
  - Sensitivity analysis
  - Causal discovery
- Causality helps ML
  - Defining disentanglement
  - Criteria for better out-of-distribution generalization
  - Criteria for fair prediction
Evaluation
- 30%: Reader reports for assigned readings
- 70%: Final project report
Prerequisites
I will assume programming experience and familiarity with the topics taught in Fundamentals of Machine Learning (or an equivalent course). Background in probabilistic graphical models will be useful.
Resources