
Computer Engineering

   

Fast Causal Rule Mining Algorithm

  

  • Published: 2026-05-26


Abstract: Causal relationship mining aims to reveal latent causal mechanisms in complex data. Existing approaches, mostly built on Bayesian network frameworks or on simple filtering of association rules, tend to suffer from low mining efficiency and from difficulty controlling unobserved confounding variables, which limits the accuracy and robustness of causal identification. To address this, this paper proposes a fast causal rule mining algorithm. The algorithm uses a prefix-tree structure for frequent pattern mining and combines multiple pruning strategies to improve mining efficiency. It also introduces a covariate mechanism and a matched transaction pair technique to control confounding factors, thereby improving the reliability of the mined causal rules. Experimental results show that the proposed algorithm is 3 to 4 orders of magnitude more computationally efficient than the baseline algorithms. On large-scale datasets, its execution time is a further 30%–50% lower than that of similar variants. In terms of accuracy, the algorithm maintains a Precision between 0.69 and 0.90 and generally improves F1-score by 40%–60% or more over baseline causal methods. These results support the efficiency and advantages of the proposed algorithm in large-scale causal rule mining tasks.
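The matched transaction pair idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistic the algorithm applies to the matched pairs, so the odds-ratio test on discordant pairs used here is an assumption, and the function names `matched_pair_counts` and `is_causal_rule` are hypothetical. Each record is assumed to be a dict of binary (0/1) variables.

```python
from collections import defaultdict
from math import exp, log, sqrt


def matched_pair_counts(records, cause, effect, covariates):
    """Count discordant matched pairs for the candidate rule cause -> effect.

    An exposed record (cause=1) is paired with an unexposed record (cause=0)
    only when both agree on every covariate, so the covariates cannot
    explain a difference in the effect within a pair.
    """
    # Group effect values by covariate signature: index 0 = unexposed, 1 = exposed.
    strata = defaultdict(lambda: ([], []))
    for r in records:
        key = tuple(r[c] for c in covariates)
        strata[key][r[cause]].append(r[effect])

    n10 = n01 = 0  # n10: only the exposed record shows the effect; n01: only the unexposed one
    for unexposed, exposed in strata.values():
        # Pair records in order within a stratum; zip drops the unmatched surplus.
        for y_exp, y_unexp in zip(exposed, unexposed):
            if y_exp == 1 and y_unexp == 0:
                n10 += 1
            elif y_exp == 0 and y_unexp == 1:
                n01 += 1
    return n10, n01


def is_causal_rule(n10, n01, z=1.96):
    """Accept the rule if the lower confidence bound of the odds ratio exceeds 1.

    Assumption: pairs with a zero discordant count are treated as
    insufficient evidence and rejected.
    """
    if n10 == 0 or n01 == 0:
        return False
    lower = exp(log(n10 / n01) - z * sqrt(1 / n10 + 1 / n01))
    return lower > 1


# Usage with made-up records: variables "x" (cause), "y" (effect), "age" (covariate).
records = (
    [{"x": 1, "y": 1, "age": 0}] * 10
    + [{"x": 0, "y": 0, "age": 0}] * 10
    + [{"x": 1, "y": 0, "age": 1}]
    + [{"x": 0, "y": 1, "age": 1}]
)
n10, n01 = matched_pair_counts(records, "x", "y", ["age"])
accepted = is_causal_rule(n10, n01)
```

Exact matching on covariates is the simplest choice for this illustration; it only adjusts for covariates that are actually recorded, and the paper's own matching procedure may differ.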
