There is growing interest in the intersection of causal inference and machine learning. On one hand, ML methods (e.g., prediction methods, unsupervised methods, representation learning) can be adapted to estimate causal relationships between variables. On the other hand, the language of causality could lead to new learning criteria that yield more robust and fair ML algorithms. In this course, we'll begin with an introduction to the theory behind causal inference. Next, we'll cover work on causal estimation with neural networks, representation learning for causal inference, and flexible sensitivity analysis. We'll conclude with work that draws upon causality to make machine learning methods fair or robust. This is an advanced course, and students are expected to have a strong background in ML.
Covid-19 Related Updates
- Classes to return to in-person mode starting Jan 31.
- From UdeM: Classes to be fully online until Jan. 31.
- Classes to begin week of Jan 10.
Start date: Jan. 11
When: Tuesdays, 12:30 to 2:30 PM, and Fridays, 11:30 AM to 1:30 PM
In-person: Auditorium 2, 2nd floor, Mila building, 6650 Rue Saint-Urbain.
On Zoom: link on Studium.
- Attend in person if possible.
Office hours: Tuesdays, 2:30 to 3:30 PM, Room F.04 in the Mila building.
Assigned Readings (updated often)
* = under resources on Piazza
Jan 14: Chapters 18, 19 and 20, Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi
Jan 18: Chapter 21 (on estimation), Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi
Jan 21: Chapter 1 (on potential outcomes), Causal Inference for Statistics, Social, and Biomedical Sciences*, by Guido Imbens and Donald Rubin
Jan 25: Identification of Causal Effects Using Instrumental Variables*, by Joshua Angrist, Guido Imbens and Donald Rubin
Jan 28: Chapter 4 (on counterfactuals), Causal Inference in Statistics: A Primer*, by Judea Pearl, Madelyn Glymour and Nicholas Jewell
- Adapting Text Embeddings for Causal Inference
- Adapting Neural Networks for the Estimation of Treatment Effects
- Valid Causal Inference with (Some) Invalid Instruments
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding
- A Simulation-Based Test of Identifiability for Bayesian Causal Inference
- Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction
- Nonlinear causal discovery with additive noise models
- Causal Autoregressive Flows
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning
- Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA
- Invariant Causal Prediction for Nonlinear Models
- Invariant Risk Minimization
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
- Nonlinear Invariant Risk Minimization: A Causal Approach
- Conditional variance penalties and domain shift robustness
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests
- Counterfactual fairness
- Counterfactual Risk Assessments, Evaluation, and Fairness
- Fairness in Risk Assessment Instruments: Post-Processing to Achieve Counterfactual Equalized Odds
- Robustly Disentangled Causal Mechanisms
- Towards Causal Representation Learning
- Introduction to causality
- Causal graphical models
- Defining causal quantities: interventions and counterfactuals
- Identifying causal quantities: graphical criteria, and instrumental variables
- Estimating causal quantities
- ML helps causality
- Adapting neural networks for estimation
- Learning representations for causal inference
- Sensitivity analysis
- Causal discovery
- Causality helps ML
- Defining disentanglement
- Criteria for better out-of-distribution generalization
- Criteria for fair prediction
- 30% -- Reader reports for assigned readings
- 70% -- Final project report
I will assume programming experience and familiarity with topics taught in Fundamentals of Machine Learning (or equivalent). Background in probabilistic graphical models will be useful.