Causal Structure Learning Algorithm Based on Causal Autoregressive Flow Model

Computer Engineering, 2024, Vol. 50, Issue 3: 131-136. doi: 10.19678/j.issn.1000-3428.0066564

Xiaojin LU¹, Wei CHEN¹, Zhifeng HAO¹,², Ruichu CAI¹,*

1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, Guangdong, China
2. School of Science, Shantou University, Shantou 515063, Guangdong, China

Received: 2022-12-19 Online: 2024-03-15 Published: 2023-03-30
Corresponding author: Ruichu CAI
Funding:
National Natural Science Foundation of China (61876043); National Natural Science Foundation of China (61976052); National Natural Science Foundation of China (62206064); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0111501); National Science Fund for Excellent Young Scholars (62122022)

Abstract

The causal autoregressive flow model has made progress on causal direction inference in settings such as non-independent noise, where the noise is affected by the parent nodes. With multiple nodes, however, global structure search leads to low accuracy and high computational cost. This study therefore designs a two-stage causal structure learning algorithm for non-temporal observational data. In the first stage, conditional independence tests on the observed data prune a complete undirected graph down to a basic causal skeleton. In the second stage, based on the causal autoregressive flow model, normalizing flows are used to compute the marginal likelihood of each undirected edge of the skeleton in both directions, and the causal direction is inferred by comparing these marginal likelihoods. On simulated causal structure datasets generated under several parameter settings, the algorithm performs well throughout, with an F1 score on average 15%-28% higher than existing mainstream causal structure learning algorithms. On a real causal structure dataset, it learns the causal relationships among variables fairly completely and accurately, with an F1 score on average 28%-48% higher than mainstream causal structure learning algorithms, which indicates stronger robustness.

Key words: causal structure learning, causal discovery, additive noise model, causal autoregressive flow model, normalizing flow
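The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions that are not on this page: stage 1 uses a PC-style search with Fisher z tests on partial correlations (the paper's actual independence test is not stated here), and stage 2 replaces the trained normalizing flow with a much simpler stand-in, an affine autoregressive transform whose shift is a cubic polynomial and whose base density is a Laplace distribution. The function names and the parameters `alpha`, `max_cond`, and `degree` are hypothetical.

```python
import itertools
import math

import numpy as np


def fisher_z_pvalue(corr, i, j, cond, n):
    """p-value of the Fisher z test for the partial correlation of i and j given cond."""
    idx = [i, j, *cond]
    precision = np.linalg.inv(corr[np.ix_(idx, idx)])
    r = -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def learn_skeleton(data, alpha=0.001, max_cond=2):
    """Stage 1: prune a complete undirected graph with conditional independence tests."""
    n, d = data.shape
    corr = np.corrcoef(data, rowvar=False)
    adj = {i: set(range(d)) - {i} for i in range(d)}
    for size in range(max_cond + 1):
        for i, j in itertools.combinations(range(d), 2):
            if j not in adj[i]:
                continue
            # Simplification: condition on subsets of the joint neighbourhood.
            candidates = (adj[i] | adj[j]) - {i, j}
            for cond in itertools.combinations(sorted(candidates), size):
                if fisher_z_pvalue(corr, i, j, cond, n) > alpha:
                    adj[i].discard(j)
                    adj[j].discard(i)
                    break
    return {frozenset((i, j)) for i in adj for j in adj[i]}


def laplace_loglik(z):
    """Log-likelihood of z under a zero-centred Laplace with maximum-likelihood scale."""
    b = np.mean(np.abs(z))
    return -len(z) * (math.log(2 * b) + 1)


def direction_loglik(cause, effect, degree=3):
    """Log-likelihood of the pair under cause -> effect (polynomial shift, Laplace base)."""
    shift = np.polyval(np.polyfit(cause, effect, degree), cause)
    return laplace_loglik(cause) + laplace_loglik(effect - shift)


def orient_edges(data, skeleton):
    """Stage 2: orient each skeleton edge towards the direction with higher likelihood."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    directed = set()
    for edge in skeleton:
        i, j = sorted(edge)
        forward = direction_loglik(z[:, i], z[:, j])
        backward = direction_loglik(z[:, j], z[:, i])
        directed.add((i, j) if forward > backward else (j, i))
    return directed


# Toy chain x0 -> x1 -> x2 with Laplace noise (synthetic data, for illustration only).
rng = np.random.default_rng(0)
n = 50_000
x0 = rng.laplace(size=n)
x1 = x0 + rng.laplace(size=n)
x2 = x1 + rng.laplace(size=n)
data = np.column_stack([x0, x1, x2])

skeleton = learn_skeleton(data)
directed = orient_edges(data, skeleton)
print(sorted(directed))
```

Restricting the likelihood comparison to skeleton edges is what avoids the global structure search: each edge needs only two local fits, one per direction. A real causal autoregressive flow would learn both the shift and the scale with neural networks rather than a fixed polynomial.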

Xiaojin LU, Wei CHEN, Zhifeng HAO, Ruichu CAI. Causal Structure Learning Algorithm Based on Causal Autoregressive Flow Model[J]. Computer Engineering, 2024, 50(3): 131-136.

http://www.ecice06.com/CN/Y2024/V50/I3/131

Figures/Tables 9

Fig.1 Framework of the SCARF algorithm

Fig.2 Experimental results on simulated data with different causal mechanisms

Fig.3 Experimental results for different numbers of nodes

Fig.4 Experimental results for different average in-degrees

Fig.5 Experimental results for different sample sizes

Fig.6 F1 scores of different algorithms on the real causal structure dataset

Fig.7 Recall of different algorithms on the real causal structure dataset

Fig.8 Accuracy of different algorithms on the real causal structure dataset
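The F1 scores and recall reported in the figures compare a learned graph with the ground-truth graph. This page does not state the exact counting rule, so the sketch below assumes one common convention: a learned edge counts as correct only if it appears in the true graph with the same direction. The function name `edge_metrics` and the example edge sets are hypothetical.

```python
import math


def edge_metrics(true_edges, learned_edges):
    """Precision, recall and F1 of a learned directed edge set against the true one."""
    true_edges, learned_edges = set(true_edges), set(learned_edges)
    hits = len(true_edges & learned_edges)  # edges recovered with the right direction
    precision = hits / len(learned_edges) if learned_edges else 0.0
    recall = hits / len(true_edges) if true_edges else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up example: one of three edges is learned with the wrong direction.
true_graph = {(0, 1), (1, 2), (2, 3)}
learned_graph = {(0, 1), (2, 1), (2, 3)}
precision, recall, f1 = edge_metrics(true_graph, learned_graph)
print(precision, recall, f1)
```

Under this convention a reversed edge is penalised twice, once as a missed true edge and once as a spurious learned edge, so the score reflects orientation errors as well as skeleton errors.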


