Missing Value Causal Learning Algorithm Based on Structural Equation Likelihood Framework

doi:10.19678/j.issn.1000-3428.0066474

Abstract

Discovering causal relationships is a core problem in data science. In practice, missing values pose major challenges to both constraint-based methods and methods based on structural equation models. Existing causal learning methods for missing data can learn causal structure when data are missing at random, but for data that are missing not at random several problems remain unsolved: learning the causal pairs and Markov equivalence class structures in a causal network, and correcting causal directions that are wrong because of missingness. To address these problems, this paper proposes MV-SELF, a new causal learning algorithm for missing values based on the structural equation likelihood framework. The algorithm exploits the property that the conditional probability distribution of a nonlinear Additive Noise Model (ANM) can be expressed through the noise distribution, designs a score based on likelihood maximization, and uses it in a score-based causal structure search framework. To handle data that are missing not at random, MV-SELF applies Inverse Probability Weighting (IPW) correction to recover the joint distribution from the incomplete data. This corrects the redundant edges and wrong causal directions caused by missingness and enables high-dimensional causal structure search on data with missing values. Simulation experiments show that, compared with the TD-PC, MVPC, and Structure EM algorithms, MV-SELF improves the F1 value by 3% to 19% and can effectively distinguish Markov equivalence classes.

Key words: structural equation likelihood framework, missing data, Inverse Probability Weighting (IPW), causal direction learning, Additive Noise Model (ANM)
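As a rough illustration of the two ideas in the abstract, the sketch below scores each candidate direction of a variable pair by an additive-noise log-likelihood and reweights the complete cases by inverse observation probabilities. It is not the MV-SELF implementation: the polynomial regression, the Gaussian noise density, the function names `anm_score` and `weighted_gaussian_loglik`, the synthetic data, the simple missingness mechanism, and the assumption that the observation probabilities are known are all choices made only for this example.

```python
import numpy as np


def weighted_gaussian_loglik(values, weights):
    """Weighted log-likelihood of `values` under a Gaussian fitted by weighted MLE."""
    total = weights.sum()
    mean = np.sum(weights * values) / total
    var = np.sum(weights * (values - mean) ** 2) / total
    log_density = -0.5 * np.log(2 * np.pi * var) - (values - mean) ** 2 / (2 * var)
    return float(np.sum(weights * log_density))


def anm_score(cause, effect, weights, degree=3):
    """Score cause -> effect as log p(cause) + log p_noise(effect - f(cause))."""
    # Assumption: f is approximated by a weighted polynomial regression.
    design = np.vander(cause, degree + 1)
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], effect * root_w, rcond=None)
    residual = effect - design @ coef  # estimated noise term
    # Assumption: both the cause and the noise are scored with Gaussian densities.
    return (weighted_gaussian_loglik(cause, weights)
            + weighted_gaussian_loglik(residual, weights))


# Hypothetical synthetic pair: X -> Y with Y = X^3 + noise.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = x ** 3 + rng.normal(size=n)

# Hypothetical missingness: Y is recorded with a probability that depends on X,
# so the complete cases over-represent large values of X.
p_observed = 1.0 / (1.0 + np.exp(-(1.0 + x)))
observed = rng.random(n) < p_observed

# Inverse probability weights for the complete cases (propensities assumed known).
w = 1.0 / p_observed[observed]
x_cc, y_cc = x[observed], y[observed]

score_xy = anm_score(x_cc, y_cc, w)  # candidate direction X -> Y
score_yx = anm_score(y_cc, x_cc, w)  # candidate direction Y -> X
print("preferred direction:", "X->Y" if score_xy > score_yx else "Y->X")
```

Under these assumptions the direction with the higher weighted score is preferred. In a full structure search, a score of this kind would be evaluated for each candidate parent set rather than for a single pair.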

Zhifeng HAO, Jianhua YU, Jie QIAO, Ruichu CAI. Missing Value Causal Learning Algorithm Based on Structural Equation Likelihood Framework[J]. Computer Engineering, 2023, 49(12): 63-70.

http://www.ecice06.com/CN/Y2023/V49/I12/63

Figures/Tables (14)

Fig.1 Types of missingness graphs

Fig.2 Causal learning framework for missing values based on the structural equation likelihood framework

Fig.3 F1 values in the missing-variable control experiment

Fig.4 Structural Hamming distance in the missing-variable control experiment

Fig.5 F1 values in the structure-dimension control experiment

Fig.6 Structural Hamming distance in the structure-dimension control experiment

Fig.7 F1 values in the average in-degree control experiment

Fig.8 Structural Hamming distance in the average in-degree control experiment

Fig.9 F1 values in the sample-size control experiment

Fig.10 Structural Hamming distance in the sample-size control experiment

Fig.11 F1 values in the real-structure comparison experiment

Fig.12 Precision in the real-structure comparison experiment

Fig.13 Recall in the real-structure comparison experiment
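The figures report F1, precision, recall, and structural Hamming distance. The sketch below shows one common way to compute these metrics from a true and an estimated directed graph given as adjacency matrices. The paper's exact definitions are not stated on this page, so the conventions here are assumptions: directed edges are compared one by one, a reversed edge counts as one error in the Hamming distance, and the function names and the three-node example graphs are hypothetical.

```python
import numpy as np


def edge_precision_recall_f1(true_adj, est_adj):
    """Precision, recall and F1 over directed edges (adj[i, j] means i -> j)."""
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    true_positive = np.sum(true_adj & est_adj)
    n_estimated = est_adj.sum()
    n_true = true_adj.sum()
    precision = true_positive / n_estimated if n_estimated else 0.0
    recall = true_positive / n_true if n_true else 0.0
    if precision + recall == 0:
        return float(precision), float(recall), 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return float(precision), float(recall), float(f1)


def structural_hamming_distance(true_adj, est_adj):
    """Number of node pairs whose edge status differs (missing, extra or reversed)."""
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    # A pair (i, j) differs if either direction disagrees between the two graphs.
    differs = (true_adj != est_adj) | (true_adj.T != est_adj.T)
    return int(np.triu(differs, k=1).sum())


# Hypothetical example: the true graph is the chain 0 -> 1 -> 2.
true_graph = np.array([[0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 0]])
# The estimate keeps 0 -> 1, reverses 1 -> 2, and adds a redundant edge 0 -> 2.
est_graph = np.array([[0, 1, 1],
                      [0, 0, 0],
                      [0, 1, 0]])

precision, recall, f1 = edge_precision_recall_f1(true_graph, est_graph)
shd = structural_hamming_distance(true_graph, est_graph)
print(precision, recall, f1, shd)
```

In this example one of three estimated edges is correct and one of two true edges is recovered, and the two graphs differ on two node pairs (one reversed edge and one redundant edge).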

