CausalML

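Inducing knowledge from experience is a common route to learning, but knowledge obtained this way is bounded by what has been observed and rarely yields breakthroughs. Current machine learning methods tend to fit the data, trying to learn the past perfectly rather than to discover the true (causal) relationships that persist over time. This document first surveys the research directions of causal theory and introduces related concepts, then discusses where causal theory and machine learning meet, and finally outlines how we envision applying causal theory.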

Causal theory research: Causal Inference & Causal Discovery
Causality combined with machine learning: Causal RL, Causal LTR, Causal Domain Adaptation, Causal Stable Learning, Mediation, etc.

Causal inference

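Causal inference estimates the impact or benefit of an action or factor, that is, it finds a parameter that quantifies the causal relationship between variables. Data fall into two categories according to how they are generated: data from deliberately controlled, randomized experiments support causal inference directly, whereas observational data require additional prior knowledge before causal inference is possible. Many causal inference methods exist to adjust for differences in data distribution and to address selection bias: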

Re-weighting Methods

Propensity score based sample re-weighting: Labor Market Institutions and the Distribution of Wages
Improving predictive inference under covariate shift by weighting the log-likelihood function
Robust Importance Weighting for Covariate Shift
Reweighting samples under covariate shift using a Wasserstein distance criterion
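
As a minimal sketch of propensity score based re-weighting, the example below compares a naive difference in means with a self-normalised inverse-propensity-weighted (IPW) estimate. The synthetic data, the binary confounder `x`, and the assumption that the true propensity is known are illustrative choices, not taken from the papers listed above; in practice the propensity has to be estimated.

```python
import random
from statistics import mean


def ipw_ate(treatment, outcome, propensity):
    """Self-normalised inverse-propensity-weighted estimate of the average treatment effect."""
    w_treated = [t / e for t, e in zip(treatment, propensity)]
    w_control = [(1 - t) / (1 - e) for t, e in zip(treatment, propensity)]
    mean_treated = sum(w * y for w, y in zip(w_treated, outcome)) / sum(w_treated)
    mean_control = sum(w * y for w, y in zip(w_control, outcome)) / sum(w_control)
    return mean_treated - mean_control


# Synthetic observational data (illustrative assumption): a binary confounder x
# raises both the chance of treatment and the outcome. The true effect is 2.0.
rng = random.Random(0)
treatment, outcome, propensity = [], [], []
for _ in range(20000):
    x = 1 if rng.random() < 0.5 else 0
    e = 0.8 if x else 0.2                    # P(T=1 | x), assumed known here
    t = 1 if rng.random() < e else 0
    y = 2.0 * t + 3.0 * x + rng.gauss(0, 1)
    treatment.append(t)
    outcome.append(y)
    propensity.append(e)

# The naive contrast ignores x and is therefore confounded.
naive = (mean(y for t, y in zip(treatment, outcome) if t == 1)
         - mean(y for t, y in zip(treatment, outcome) if t == 0))
ipw = ipw_ate(treatment, outcome, propensity)
print(f"naive difference: {naive:.2f}, IPW estimate: {ipw:.2f}")
```

The naive contrast absorbs the effect of the confounder, while the weighted estimate should land close to the simulated effect of 2.0.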

Stratification Methods

Causal inference in statistics, social, and biomedical sciences
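
Stratification (subclassification) can be sketched as follows: units are sorted by a balancing score, split into equal-sized strata, and the treated-minus-control difference is averaged across strata. The data-generating process, the choice of five strata, and the use of the true propensity as the score are assumptions made for illustration only.

```python
import random
from statistics import mean


def stratified_ate(treatment, outcome, score, n_strata=5):
    """Average within-stratum treated-minus-control differences, weighted by stratum size."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    n = len(order)
    weighted_sum, n_used = 0.0, 0
    for k in range(n_strata):
        stratum = order[k * n // n_strata:(k + 1) * n // n_strata]
        treated = [outcome[i] for i in stratum if treatment[i] == 1]
        control = [outcome[i] for i in stratum if treatment[i] == 0]
        if not treated or not control:
            continue  # no overlap in this stratum, so it cannot be compared
        weighted_sum += len(stratum) * (mean(treated) - mean(control))
        n_used += len(stratum)
    return weighted_sum / n_used


# Synthetic data (illustrative assumption): x drives both treatment and outcome.
rng = random.Random(0)
treatment, outcome, score = [], [], []
for _ in range(20000):
    x = rng.random()
    e = 0.2 + 0.6 * x                        # true propensity, used as the score
    t = 1 if rng.random() < e else 0
    treatment.append(t)
    outcome.append(2.0 * t + 3.0 * x + rng.gauss(0, 1))
    score.append(e)

naive = (mean(y for t, y in zip(treatment, outcome) if t == 1)
         - mean(y for t, y in zip(treatment, outcome) if t == 0))
stratified = stratified_ate(treatment, outcome, score)
print(f"naive: {naive:.2f}, stratified: {stratified:.2f}")
```

With only a few strata some residual confounding remains inside each stratum, so the estimate is close to, but not exactly, the simulated effect of 2.0.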

Matching Methods

Distance Matching:
- Bias reduction using Mahalanobis metric matching
- Genetic matching for estimating causal effects: A general multivariate matching method for achieving balance in observational studies
PSM
- Interval estimation for treatment effects using propensity score matching
- Combining propensity score matching with additional adjustments for prognostic covariates
See also: (02)-Matching
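
A minimal sketch of distance matching, assuming a single covariate so that the distance is simply an absolute difference (with several covariates a Mahalanobis distance or a propensity score would take its place). Each treated unit is matched, with replacement, to its nearest control, which gives an estimate of the effect on the treated (ATT). The synthetic data are an assumption for illustration.

```python
import random
from statistics import mean


def att_by_nearest_neighbor(x, treatment, outcome):
    """ATT estimate: match each treated unit to its closest control on x (with replacement)."""
    controls = [(xi, yi) for xi, ti, yi in zip(x, treatment, outcome) if ti == 0]
    diffs = []
    for xi, ti, yi in zip(x, treatment, outcome):
        if ti == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - xi))
            diffs.append(yi - y_match)
    return mean(diffs)


# Synthetic data (illustrative assumption): larger x means more treatment
# and a higher outcome. The true effect is 2.0 for every unit.
rng = random.Random(0)
x, treatment, outcome = [], [], []
for _ in range(2000):
    xi = rng.random()
    t = 1 if rng.random() < 0.2 + 0.6 * xi else 0
    x.append(xi)
    treatment.append(t)
    outcome.append(2.0 * t + 3.0 * xi + rng.gauss(0, 1))

naive = (mean(y for t, y in zip(treatment, outcome) if t == 1)
         - mean(y for t, y in zip(treatment, outcome) if t == 0))
att = att_by_nearest_neighbor(x, treatment, outcome)
print(f"naive: {naive:.2f}, matched ATT: {att:.2f}")
```

The brute-force search is quadratic in the sample size; it is kept here for readability rather than speed.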

Tree-based Methods

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Orthogonal Random Forest for Causal Inference
Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees
Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects

Representation Learning Methods

Balanced representation learning: Estimating individual treatment effect: generalization bounds and algorithms
Weighted representation learning: Learning Weighted Representations for Generalization Across Designs
Domain adaptation + representation learning: Learning Representations for Counterfactual Inference
See also: (05)-Representation learning

Multitask learning Methods

Multi-task Gaussian process: Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes
Multi-task with Propensity-Dropout: Deep Counterfactual Networks with Propensity-Dropout
See also: (06)-Multitask learning

Meta-Learning Methods

S-Learner, T-Learner, X-Learner: Metalearners for estimating heterogeneous treatment effects using machine learning
R-Learner: Quasi-Oracle Estimation of Heterogeneous Treatment Effects
See also: (04)-Meta learning
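
To make the meta-learner idea concrete, here is a minimal T-learner sketch: one outcome model is fitted per treatment arm, and the conditional average treatment effect (CATE) is the difference of their predictions. The base learner (a one-covariate least-squares line), the helper names `fit_line` and `t_learner`, and the synthetic randomized data are all assumptions for illustration; any regression model could serve as the base learner.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def t_learner(x, treatment, outcome):
    """Fit one outcome model per arm and return a function estimating CATE(x)."""
    a1, b1 = fit_line([xi for xi, t in zip(x, treatment) if t == 1],
                      [yi for yi, t in zip(outcome, treatment) if t == 1])
    a0, b0 = fit_line([xi for xi, t in zip(x, treatment) if t == 0],
                      [yi for yi, t in zip(outcome, treatment) if t == 0])
    return lambda xi: (a1 + b1 * xi) - (a0 + b0 * xi)


# Synthetic randomized data (illustrative assumption): the true effect
# is heterogeneous, tau(x) = 1 + 2x.
rng = random.Random(0)
x, treatment, outcome = [], [], []
for _ in range(5000):
    xi = rng.random()
    t = 1 if rng.random() < 0.5 else 0
    x.append(xi)
    treatment.append(t)
    outcome.append(1.0 + xi + t * (1.0 + 2.0 * xi) + rng.gauss(0, 0.5))

cate = t_learner(x, treatment, outcome)
print(f"CATE at x=0: {cate(0.0):.2f}, CATE at x=1: {cate(1.0):.2f}")
```

An S-learner would instead fit a single model with the treatment as an input feature and difference its predictions under treatment 1 and 0.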

Causal Discovery (Causal Structure Search)

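Causal discovery identifies causes among many observed and unobserved variables: given a set of variables, it finds the causal relationships between them. Most causal discovery methods are based on causal graphs and are listed below: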

Graphical Models based

Constraint-based
- PC: Causation, Prediction, and Search
- FCI: Causation, Prediction, and Search
Score-based
- Greedy Equivalence Search (GES): Optimal structure identification with greedy search
Functional causal models based
- Linear, non-Gaussian models: LiNGAM, A linear non-Gaussian acyclic model for causal discovery
- Non-linear models: non-linear additive noise model (ANM), Nonlinear causal discovery with additive noise models; post-nonlinear causal model (PNL), On the identifiability of the post-nonlinear causal model
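
Constraint-based methods such as PC decide which edges to keep by testing conditional independence. The sketch below illustrates that building block only, not the full algorithm: in a simulated chain X → Z → Y, X and Y are correlated, but their partial correlation given Z is close to zero. Using partial correlation as the independence measure assumes linear-Gaussian relations, and the data are synthetic.

```python
import math
import random
from statistics import mean


def corr(a, b):
    """Pearson correlation coefficient."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for a single variable z."""
    r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic chain X -> Z -> Y (illustrative assumption).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(5000)]
z = [xi + rng.gauss(0, 1) for xi in x]
y = [zi + rng.gauss(0, 1) for zi in z]

r_marginal = corr(x, y)
r_given_z = partial_corr(x, y, z)
print(f"corr(X, Y) = {r_marginal:.2f}, corr(X, Y | Z) = {r_given_z:.2f}")
```

A constraint-based search would turn the second quantity into a statistical test and drop the X to Y edge when independence given Z cannot be rejected.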

Temporal Causal Discovery Framework

A framework for causal discovery: Causal Discovery with Attention-Based Convolutional Neural Networks
See also: (09)-Causal Discovery

Causal with other domains

Causal with Recommendation

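This line of work mainly studies bias in recommendation data: the various biases that arise in a recommender system lead it to recommend unintended items. One approach optimizes the ranking model using causal theory (see the LTR section). Others address data bias by combining unbiased learning to rank with click models with decay, or by using RL policy-gradient algorithms with off-policy correction.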
Causal Embeddings for Recommendation
The Deconfounded Recommender: A Causal Inference Approach to Recommendation
Doubly Robust Joint Learning for Recommendation on Data Missing Not at Random
Top-K Off-Policy Correction for a REINFORCE Recommender System
Recommendations as Treatments: Debiasing Learning and Evaluation
Offline Recommender Learning Meets Unsupervised Domain Adaptation

Causal with LTR

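Most LTR models are trained on user feedback data, most of which is implicit, such as clicks, views, favorites, and comments. Such data carry many biases, including position bias and selection bias. Approaches grounded in causal theory include Heckman rank, Propensity SVM rank, and TrustPBM.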
Unbiased Learning-to-Rank with Biased Feedback
Unbiased Learning to Rank with Unbiased Propensity Estimation
Addressing Trust Bias for Unbiased Learning-to-Rank
Correcting for Selection Bias in Learning-to-rank Systems
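
Position bias can be illustrated with a minimal inverse-propensity-scoring sketch. It assumes a position-based click model, where a click requires that the result is examined and that it is relevant, and it assumes the examination probabilities are known. The numbers in `EXAM_PROB` and `RELEVANCE` are hypothetical, and in practice the propensities must be estimated.

```python
import random
from statistics import mean

# Assumed position-based model: P(click) = P(examined | position) * relevance.
EXAM_PROB = {1: 1.0, 2: 0.3}          # hypothetical examination propensities
RELEVANCE = {"A": 0.4, "B": 0.6}      # hypothetical true relevance
POSITION = {"A": 1, "B": 2}           # the logging ranker always puts A above B

rng = random.Random(0)
clicks = {doc: [] for doc in RELEVANCE}
for _ in range(20000):
    for doc, pos in POSITION.items():
        examined = rng.random() < EXAM_PROB[pos]
        clicked = examined and rng.random() < RELEVANCE[doc]
        clicks[doc].append(1 if clicked else 0)

# Raw click-through rate mixes relevance with position.
naive_ctr = {doc: mean(c) for doc, c in clicks.items()}
# Dividing by the examination propensity removes the position effect.
ips_relevance = {doc: mean(c) / EXAM_PROB[POSITION[doc]] for doc, c in clicks.items()}

print("naive CTR:", naive_ctr)
print("IPS relevance:", ips_relevance)
```

The raw click-through rate favours document A because it is always shown first, whereas the propensity-weighted estimate recovers the ordering of the simulated relevance values.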

Causal with RL

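Causality and RL are similar in many respects. They are usually combined in the following ways: removing confounding effects in reinforcement learning algorithms, applying the counterfactual framework within reinforcement learning, causal representation learning, and using reinforcement learning methods for causal discovery.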
Deconfounding Reinforcement Learning in Observational Settings
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search
Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning
Structural Nested Models and G-estimation: The Partially Realized Promise
CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training

Other related

Causal Model Framework

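Frameworks for causal models, including the Neyman-Rubin causal model (RCM), structural causal models (SCM), and the potential outcomes calculus (Po-calculus)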
The Neyman-Rubin Model of Causal Inference and Estimation Via Matching Methods
From Ordinary Differential Equations to Structural Causal Models: the deterministic case
A Potential Outcomes Calculus for Identifying Conditional Path-Specific Effects

Causal with Domain adaptation

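Domain adaptation is concerned with how an algorithm performs on the test set, and it usually makes assumptions about how the test distribution changes, such as target shift, conditional shift, and generalized target shift. By learning a graphical representation of how the distribution changes and treating domain adaptation as an inference problem, one can perform domain adaptation or transfer learning.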
Domain adaptation under target and conditional shift
Few-shot Domain Adaptation by Causal Mechanism Transfer
Domain adaptation with conditional transferable components
Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions
Domain adaptation under structural causal models
Domain adaptation as a problem of inference on graphical models

Causal with Time Series

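Assuming that the causal relations are linear and the noise is non-Gaussian, this line of work studies causal relations discovered from subsampled data and confounded time series.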
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders
Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks

Causal Stable Learning

The goal is to learn models that generalize to data from different distributions (OOD). Methods for learning robust predictive models fall into the following classes: structural causal model based methods, distributionally robust optimization based methods, and sample re-weighting based methods.
Causal inference using invariant prediction: identification and confidence intervals
Invariant Causal Prediction for Sequential Data
Learning Models with Uniform Performance via Distributionally Robust Optimization
Causally regularized learning with agnostic data selection bias
Stable Learning via Sample Reweighting
Stable Prediction with Model Misspecification and Agnostic Distribution Shift
Stable Prediction via Leveraging Seed Variable
Stable prediction across unknown environments
Latent Causal Invariant Model

Mediation

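Mediation analysis relates to problems such as fairness and attribution. Improvements based on causal analysis remove known spurious correlations and extract the direct causal effect, reducing the chance that a model makes unfair decisions.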
An interventionist approach to mediation analysis
On semiparametric estimation of a path-specific effect in the presence of mediator-outcome confounding
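
As a simple illustration of separating a direct effect from an effect that flows through a mediator, the sketch below uses the linear product-of-coefficients decomposition. It relies on strong assumptions that the papers above relax: a randomized treatment, linear models without interaction, and no unobserved mediator-outcome confounding. The function name `linear_mediation` and the simulated coefficients are hypothetical.

```python
import random
from statistics import mean


def centered(values):
    m = mean(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def linear_mediation(treatment, mediator, outcome):
    """Return (direct, indirect, total) effects from least-squares fits."""
    t, m, y = centered(treatment), centered(mediator), centered(outcome)
    s_tt, s_mm, s_tm = dot(t, t), dot(m, m), dot(t, m)
    s_ty, s_my = dot(t, y), dot(m, y)
    a = s_tm / s_tt                              # slope of mediator ~ treatment
    det = s_tt * s_mm - s_tm ** 2
    # Coefficients of outcome ~ treatment + mediator (2x2 normal equations).
    direct = (s_mm * s_ty - s_tm * s_my) / det
    b = (s_tt * s_my - s_tm * s_ty) / det
    total = s_ty / s_tt                          # slope of outcome ~ treatment
    return direct, a * b, total


# Synthetic data (illustrative assumption): T -> M -> Y plus a direct T -> Y path.
rng = random.Random(0)
treatment, mediator, outcome = [], [], []
for _ in range(5000):
    t = 1 if rng.random() < 0.5 else 0
    m = 1.5 * t + rng.gauss(0, 1)
    treatment.append(t)
    mediator.append(m)
    outcome.append(0.5 * t + 2.0 * m + rng.gauss(0, 1))

direct, indirect, total = linear_mediation(treatment, mediator, outcome)
print(f"direct: {direct:.2f}, indirect: {indirect:.2f}, total: {total:.2f}")
```

For least-squares fits the total effect equals the direct effect plus the indirect effect in-sample, which makes the decomposition easy to sanity-check.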

Applications & Resources

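Causal methods are widely applied in online advertising, marketing, recommendation, healthcare, education, and other areas, and several companies provide open-source tools.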

FB:
- Recommendation: Observational Data for Heterogeneous Treatment Effects with Application to Recommender Systems
Hulu:
- Ad optimization: Doubly Robust Estimation of Causal Effects
- User/ad experience analysis: Causal Inference at hulu
Uber:
- User experience:
  - Mediation Modeling at Uber: Understanding Why Product Changes Work (and Don’t Work)
  - Using Causal Inference to Improve the Uber User Experience
- Tools: CausalML: Python Package for Causal Machine Learning
Alibaba:
- Search advertising: Estimating Individual Advertising Effect in E-Commerce
- Marketing:
  - 阿里文娱智能营销增益模型 (Uplift Model) 技术实践 (uplift model for intelligent marketing at Alibaba Digital Media and Entertainment: technical practice)
  - 因果推断在阿里文娱用户增长中的应用 (applying causal inference to user growth at Alibaba Digital Media and Entertainment)
Tencent:
- Ad value measurement: Uplift广告增效衡量方案 (an uplift-based scheme for measuring ad incrementality)
JD.com:
- MTA/DDA: Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network
eBay:
- MTA/DDA: Interpretable Deep Learning Model for Online Multi-touch Attribution
Beike:
- Marketing: Uplift-Model 在贝壳业务场景中的实践 (uplift modeling in practice in Beike business scenarios)
Wayfair:
- Ad optimization: Uplift modeling in Display Remarketing
- Tools: Pylift: A Fast Python Package for Uplift Modeling
Criteo:
- Recommendation: Causal Embeddings for Recommendation
LinkedIn:
- Validating the value of business campaigns: The Importance of Being Causal
- Causal inference from observational data: Estimating the effect of contributions on visitation frequency at LinkedIn
Microsoft:
- Search advertising: Causal Inference in the Presence of Interference in Sponsored Search Advertising
- Tools:
  - DoWhy: An End-to-End Library for Causal Inference
  - EconML
Huawei:
- Ad optimization: Improving Ad Click Prediction by Considering Non-displayed Events
- **Counterfactual estimation in recommendation scenarios:** http://csse.szu.edu.cn/staff/panwk/publications/Conference-SIGIR-20-KDCRec.pdf
DeepMind: Algorithms for Causal Reasoning in Probability Trees
